queenp's blog

Posted Sat 28 May 2016

Defcon Qualifier 2016: baby-re writeup

Defcon CTF qualifiers happened a week ago and had lots of interesting challenges generally. I've been slowly getting more into doing CTFs since autumn last year. I can't make claims as to their relevance to practical infosec (real world systems aren't usually deliberately flawed, real world flaws can be worse, and more obvious!) but I've been lucky enough to get into that field and I'm finding it pretty exciting and challenges help with picking up new skills.


The original challenge is down now, but the source for it is available here (although it wasn't available during the challenge).

Running it to get a rough idea what it's doing gives you:

$ ./baby-re
Var[0]: 1
Var[1]: 2
Var[2]: 3
Var[3]: 4
Var[4]: 5
Var[5]: 6
Var[6]: 7
Var[7]: 8
Var[8]: 9
Var[9]: 10
Var[10]: 11
Var[11]: 12
Var[12]: 13


;-- main:
; (fcn) sym.main 906
; var int local_20h @ rbp-0x20
; var int local_60h @ rbp-0x60
; DATA XREF from 0x004005ed (entry0)
0x004025e7      55             push rbp
0x004025e8      4889e5         mov rbp, rsp
0x004025eb      4156           push r14
0x004025ed      4155           push r13
0x004025ef      4154           push r12
0x004025f1      53             push rbx
0x004025f2      4883ec40       sub rsp, 0x40
0x004025f6      64488b042528.  mov rax, qword fs:[0x28]    ; [0x28:8]=0x3b08 ; '('
0x004025ff      488945d8       mov qword [rbp - local_28h], rax
0x00402603      31c0           xor eax, eax
0x00402605      bf082a4000     mov edi, str.Var_0_:        ; "Var[0]: " @ 0x402a08
0x0040260a      b800000000     mov eax, 0
0x0040260f      e86cdfffff     call sym.imp.printf
0x00402614      488b053d0a20.  mov rax, qword [rip + 0x200a3d] ; [0x603058:8]=0x625528203a434347 LEA obj.__TMC_END__ ; "GCC: (Ubuntu 5.3.1-14ubuntu2) 5.3.1 20160413" @ 0x603058
0x0040261b      4889c7         mov rdi, rax
0x0040261e      e87ddfffff     call sym.imp.fflush
0x004028e0      e8e1ddffff     call sym.CheckSolution
0x004028e5      84c0           test al, al
0x004028e7      7458           je 0x402941
0x004028e9      448b65d0       mov r12d, dword [rbp - local_30h]
0x0040292c      bf882a4000     mov edi, str.The_flag_is:__c_c_c_c_c_c_c_c_c_c_c_c_c_n ; "The flag is: %c%c%c%c%c%c%c%c%c%c%c%c%c." @ 0x402a88
0x00402931      b800000000     mov eax, 0
0x00402936      e845dcffff     call sym.imp.printf
0x0040293b      4883c440       add rsp, 0x40
0x0040293f      eb0a           jmp 0x40294b
0x00402941      bfb12a4000     mov edi, str.Wrong          ; "Wrong" @ 0x402ab1
0x00402946      e815dcffff     call sym.imp.puts
; JMP XREF from 0x0040293f (sym.main)
0x0040294b      b800000000     mov eax, 0

So following the jump tree, this is a pretty classic input validator crackme (where the correct inputs generate a valid value to pass a test). Whatever we're inputting has to pass CheckSolution(). By crackme standards, it's at the "very legible" end of things, for this stage, so if you're getting into reverse engineering, it's probably a good idea to work through the instructions and understand what's going on.

[0x004005d0]> pdf@sym.CheckSolution
Do you want to print 2313 lines? (y/N)

Even at a quick glance, CheckSolution() is a long and complex stream of jumps and computations:

add rax, 0x10
mov eax, dword [rax]
imul eax, edx
sub ecx, eax
mov edx, dword [rbp-local_38]
mov rax, qword [rbp-local_87]
add rax, 0x14
mov eax, dword [rax]
imul eax, edx
sub ecx, eax
mov edx, dword [rbp-local_37_4]
mov rax, qword [rbp-local_87]
add rax, 0x18
mov eax, dword [rax]
imul eax, edx
add ecx, eax
mov edx, dword [rbp-local_37]
mov rax, qword [rbp-local_87]
add rax, 0x1c
mov eax, dword [rax]
imul eax, edx
add ecx, eax
mov edx, dword [rbp-local_36_4]
mov rax, qword [rbp-local_87]

For good measure it's also riddled up and down with invalid anti-debugger opcodes and so on. So that's nice.

Having learned my lesson from the BSides Manchester crackme I boldly quit trying to read assembly and looked for a better way. Luckily enough someone suggested I ought to look into Angr, which turned out to be exactly what I needed.

Using angr

angr is a binary analysis framework that, amongst other features to numerous to mention, analyses the behaviour of binaries within a symbolic execution environment allowing it to reduce inputs and outputs for sufficiently simple programs to equations that can be solved.

In particular you can give angr success and failure conditions and make it search for an execution path (including the valid inputs and outputs) which satisfy constraints under execution.

Looking back at the original main function, I took the address shortly after the flag is was dumped as a success, and the address at the start of the failure notice being printed as a failure signal.

Python solution script

#!/usr/bin/env python
import angr
import struct

# Immediately after failing sym.CheckSolution
wrong = 0x00402941

# After printing the flag to stdout
success = 0x0040293f

proj = angr.Project("./baby-re", load_options={'auto_load_libs':False})

# So far as I understand it, a surveyor has a particular strategy for
# finding a path to the success criteria.
ex = proj.surveyors.Explorer(find=(success,), avoid=(wrong,))

if ex.found:
    f = ex.found[0].state

    # Print output from STDOUT

This felt unbelievably simple. I guess that was the intention given the name. Really glad to have slightly improved my ctf tool knowledge. Given how it works, it's possible for complexity to completely spiral out of hand with particular sorts of problem. A few things I've tried it on since consume 8gb of memory pretty comfortably in a few minutes and then crash. Still, seems really worth digging into and getting a better feel for.

Category: writeups
Tags: reversing easy smt