Debugging stripped binaries can be really tedious, especially when you can’t even figure out where to begin. I’ll cover the very basics of dealing with the pain of stripped binaries.
Example Code:
Let’s compile this program with gcc, using the -s flag to strip the binary and the -m32 flag to build a 32-bit executable.
gcc -s example.c -o example-stripped -m32
Usually, the first step in disassembling a binary in gdb is setting a breakpoint at main (break main). However, you might face this:
eniac@faisal:~$ gdb -q example-stripped
Reading symbols from example-stripped...(no debugging symbols found)...done.
(gdb) set disassembly-flavor intel
(gdb) break main
Function "main" not defined.
Make breakpoint pending on future shared library load? (y or [n])
This is usually an indicator of a stripped binary. Running file on the binary confirms that it is stripped:
eniac@faisal:~$ file example-stripped
example-stripped: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=599acec55a49a45969677bb10a255734a63fe6e0, stripped
When a program is compiled and linked, the toolchain (gcc in this case) adds extra information to the binary: a symbol table that maps names such as main to addresses and, when -g is given, full debugging information. These symbols make it easier to debug a program. A stripped binary is one where this information has been discarded, either by passing a strip flag such as -s at build time or by running strip on the finished binary. Stripping a binary reduces its size on disk and makes it a little more difficult to debug and reverse engineer. The difference in size between a stripped and a non-stripped binary can be observed:
eniac@faisal:~$ ls -lh
total 16K
-rwxrwxrwx 1 eniac eniac 7.3K Apr 26 10:49 example
-rw-rw-rw- 1 eniac eniac 195 Apr 26 09:50 example.c
-rwxrwxrwx 1 eniac eniac 5.5K Apr 26 10:48 example-stripped
1.8 KB doesn’t seem like much, but on large binaries the difference is much bigger.
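If file isn’t available, a similar check can be approximated by hand. The sketch below walks the section headers of a 32-bit little-endian ELF image and looks for a section of type SHT_SYMTAB, which is what stripping removes. The helper name has_symtab_elf32 is made up for this post, and the script assumes a well-formed file; treat it as an illustration rather than a replacement for file.

```python
import struct

SHT_SYMTAB = 2  # section type of the full symbol table (.symtab)


def has_symtab_elf32(data: bytes) -> bool:
    """Return True if a 32-bit little-endian ELF image has a SHT_SYMTAB section."""
    if data[:4] != b"\x7fELF" or data[4] != 1 or data[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF file")
    # ELF32 header fields: e_shoff at offset 32, e_shentsize at 46, e_shnum at 48.
    (e_shoff,) = struct.unpack_from("<I", data, 32)
    e_shentsize, e_shnum = struct.unpack_from("<HH", data, 46)
    for i in range(e_shnum):
        # sh_type is the second 4-byte field of each section header.
        (sh_type,) = struct.unpack_from("<I", data, e_shoff + i * e_shentsize + 4)
        if sh_type == SHT_SYMTAB:
            return True
    return False
```

Note that a dynamically linked binary keeps its .dynsym section even after stripping, which is why gdb can still show names such as time@plt later on.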
Finding the Entry Point
The first logical step would be to find the entry point. This can be done by running “info file” in gdb.
(gdb) info file
Symbols from "/home/eniac/example-stripped".
Local exec file:
`/home/eniac/example-stripped', file type elf32-i386.
Entry point: 0x80483a0
0x08048154 - 0x08048167 is .interp
0x08048168 - 0x08048188 is .note.ABI-tag
0x08048188 - 0x080481ac is .note.gnu.build-id
0x080481ac - 0x080481cc is .gnu.hash
0x080481cc - 0x0804824c is .dynsym
0x0804824c - 0x080482a3 is .dynstr
0x080482a4 - 0x080482b4 is .gnu.version
0x080482b4 - 0x080482d4 is .gnu.version_r
0x080482d4 - 0x080482dc is .rel.dyn
0x080482dc - 0x08048304 is .rel.plt
0x08048304 - 0x08048327 is .init
0x08048330 - 0x08048390 is .plt
0x08048390 - 0x08048398 is .plt.got
0x080483a0 - 0x08048552 is .text
0x08048554 - 0x08048568 is .fini
0x08048568 - 0x0804858c is .rodata
0x0804858c - 0x080485b8 is .eh_frame_hdr
0x080485b8 - 0x08048684 is .eh_frame
0x08049f08 - 0x08049f0c is .init_array
0x08049f0c - 0x08049f10 is .fini_array
0x08049f10 - 0x08049f14 is .jcr
0x08049f14 - 0x08049ffc is .dynamic
0x08049ffc - 0x0804a000 is .got
0x0804a000 - 0x0804a020 is .got.plt
0x0804a020 - 0x0804a028 is .data
0x0804a028 - 0x0804a02c is .bss
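The entry point that gdb prints here comes straight from the ELF header, so it can also be read without a debugger. A minimal Python sketch, assuming the standard ELF header layout where e_entry sits at byte offset 24 (read_entry_point is a name chosen for this post, not an existing tool):

```python
import struct


def read_entry_point(header: bytes) -> int:
    """Return e_entry given the first bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = header[4]  # 1 = 32-bit, 2 = 64-bit
    ei_data = header[5]   # 1 = little-endian, 2 = big-endian
    endian = "<" if ei_data == 1 else ">"
    # e_entry follows e_ident (16 bytes), e_type (2), e_machine (2), e_version (4).
    if ei_class == 1:
        (entry,) = struct.unpack_from(endian + "I", header, 24)
    elif ei_class == 2:
        (entry,) = struct.unpack_from(endian + "Q", header, 24)
    else:
        raise ValueError("unknown ELF class")
    return entry
```

Passing it the first 64 bytes of the binary, for example read_entry_point(open("example-stripped", "rb").read(64)), should give the same address that info file reports.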
We can see that the entry point is at 0x80483a0. Let’s set a breakpoint there and run the program.
(gdb) break *0x80483a0
Breakpoint 1 at 0x80483a0
(gdb) r
Starting program: /home/eniac/example-stripped
Breakpoint 1, 0x080483a0 in ?? ()
Great. Let’s disassemble this region.
(gdb) disassemble
No function contains program counter for selected frame.
Disassembling a Region
We can’t simply use disassemble here, because gdb doesn’t know the bounds of the function we want to disassemble. We can work around this by printing a number of instructions starting from the program counter, which points to the next instruction to execute. Let’s use x/15i $eip, which prints 15 instructions starting from eip.
(gdb) x/15i $eip
=> 0x80483a0: xor ebp,ebp
0x80483a2: pop esi
0x80483a3: mov ecx,esp
0x80483a5: and esp,0xfffffff0
0x80483a8: push eax
0x80483a9: push esp
0x80483aa: push edx
0x80483ab: push 0x8048550
0x80483b0: push 0x80484f0
0x80483b5: push ecx
0x80483b6: push esi
0x80483b7: push 0x804849b
0x80483bc: call 0x8048370 <__libc_start_main@plt>
0x80483c1: hlt
0x80483c2: xchg ax,ax
Keep in mind that this is the entry point and not the main() function. Now we have to find where on earth main() is. At address 0x80483bc we see a call to __libc_start_main@plt. A quick look at the documentation shows that this function initialises the process and calls main with the appropriate parameters.
Name
__libc_start_main — initialization routine
Synopsis
int __libc_start_main(int *(main) (int, char * *, char * *), int argc, char * * ubp_av, void (*init) (void), void (*fini) (void), void (*rtld_fini) (void), void (* stack_end));
Description
The __libc_start_main() function shall initialize the process, call the main function with appropriate arguments, and handle the return from main().
__libc_start_main() is not in the source standard; it is only in the binary standard.
http://refspecs.linuxbase.org/LSB_3.0.0/LSB-PDA/LSB-PDA/baselib---libc-start-main-.html
We can see that the first parameter is a pointer to the main() function “int *(main)”. Looking back at our disassembled entry point:
(gdb) x/15i $eip
=> 0x80483a0: xor ebp,ebp
0x80483a2: pop esi
0x80483a3: mov ecx,esp
0x80483a5: and esp,0xfffffff0
0x80483a8: push eax
0x80483a9: push esp
0x80483aa: push edx
0x80483ab: push 0x8048550
0x80483b0: push 0x80484f0
0x80483b5: push ecx
0x80483b6: push esi
0x80483b7: push 0x804849b
0x80483bc: call 0x8048370 <__libc_start_main@plt>
0x80483c1: hlt
0x80483c2: xchg ax,ax
We can see that the pointer to main() is 0x804849b, which is pushed on the stack right before calling __libc_start_main@plt (it is the first parameter, and arguments are pushed on the stack in reverse order). Let’s set a breakpoint at this address and continue the execution.
(gdb) break *0x804849b
Breakpoint 3 at 0x804849b
(gdb) c
Continuing.
Breakpoint 3, 0x0804849b in ?? ()
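The reasoning used to spot main() can be written down as a small script: find the call to __libc_start_main and take the operand of the last push before it. This is only a rough sketch. It assumes 32-bit code with stack-passed arguments, Intel-syntax x/i output like the listing above, and that main’s address is pushed as an immediate. find_main_address is a hypothetical helper, not something gdb provides.

```python
import re


def find_main_address(listing: str):
    """Return the immediate pushed last before the call to __libc_start_main, or None."""
    last_operand = None
    for line in listing.splitlines():
        push = re.search(r"\bpush\s+(.+?)\s*$", line)
        if push:
            # Remember the operand of the most recent push instruction.
            last_operand = push.group(1)
        elif "call" in line and "__libc_start_main" in line:
            # The last push before the call holds the first argument: main.
            if last_operand and re.fullmatch(r"0x[0-9a-fA-F]+", last_operand):
                return int(last_operand, 16)
            return None
    return None
```

If the last push is a register instead of an immediate, the function returns None and you are back to reading the listing by hand.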
Let’s disassemble the region using the same x/15i $eip command.
=> 0x804849b: lea ecx,[esp+0x4]
0x804849f: and esp,0xfffffff0
0x80484a2: push DWORD PTR [ecx-0x4]
0x80484a5: push ebp
0x80484a6: mov ebp,esp
0x80484a8: push ecx
0x80484a9: sub esp,0x14
0x80484ac: sub esp,0xc
0x80484af: push 0x0
0x80484b1: call 0x8048350 <time@plt>
0x80484b6: add esp,0x10
0x80484b9: sub esp,0xc
0x80484bc: push eax
0x80484bd: call 0x8048360 <srand@plt>
0x80484c2: add esp,0x10
Nice, we can see the main() prologue and the calls to time() at 0x80484b1 and srand() at 0x80484bd. One nice trick is to define a hook-stop that prints the next 15 instructions from the program counter every time a step is taken.
(gdb) define hook-stop
Type commands for definition of "hook-stop".
End with a line saying just "end".
>x/15i $eip
>end
(gdb) ni
=> 0x804849f: and esp,0xfffffff0
0x80484a2: push DWORD PTR [ecx-0x4]
0x80484a5: push ebp
0x80484a6: mov ebp,esp
0x80484a8: push ecx
0x80484a9: sub esp,0x14
0x80484ac: sub esp,0xc
0x80484af: push 0x0
0x80484b1: call 0x8048350 <time@plt>
0x80484b6: add esp,0x10
0x80484b9: sub esp,0xc
0x80484bc: push eax
0x80484bd: call 0x8048360 <srand@plt>
0x80484c2: add esp,0x10
0x80484c5: call 0x8048380 <rand@plt>
0x0804849f in ?? ()
(gdb) ni
=> 0x80484a2: push DWORD PTR [ecx-0x4]
0x80484a5: push ebp
0x80484a6: mov ebp,esp
0x80484a8: push ecx
0x80484a9: sub esp,0x14
0x80484ac: sub esp,0xc
0x80484af: push 0x0
0x80484b1: call 0x8048350 <time@plt>
0x80484b6: add esp,0x10
0x80484b9: sub esp,0xc
0x80484bc: push eax
0x80484bd: call 0x8048360 <srand@plt>
0x80484c2: add esp,0x10
0x80484c5: call 0x8048380 <rand@plt>
0x80484ca: mov DWORD PTR [ebp-0xc],eax
0x080484a2 in ?? ()
We just disassembled a stripped binary!