How to find the main function's entry point in an ELF executable without any symbol information?

Tags: linux, reverse, elf

I developed a small C++ program on Ubuntu Linux 11.10 and now want to reverse engineer it. I am a beginner. My tools are GDB 7.0, the hte editor, and a hex editor.

The first time, it was easy: with symbol information I found the address of main and did everything I needed. Then I stripped the ELF executable (--strip-all) and ran into problems. I know that main starts at 0x8960 in this program, but I have no idea how I would find that address without knowing it in advance. I tried stepping through the program in gdb, but execution goes into __libc_start_main and then into ld-linux.so.3 (which finds and loads the shared libraries the program needs). I debugged it for about 10 minutes. Perhaps in another 20 minutes I would reach main's entry point, but there must be an easier way.

What should I do to find main's entry point without any symbol information? Can you recommend good books, sites, or other sources on reverse engineering ELF files with gdb? Any help would be appreciated.

Lucky Man asked Mar 27 '12 08:03



2 Answers

Locating main() in a stripped Linux ELF binary is straightforward. No symbol information is required.

The prototype for __libc_start_main is

int __libc_start_main(int (*main) (int, char**, char**), 
                      int argc, 
                      char *__unbounded *__unbounded ubp_av, 
                      void (*init) (void), 
                      void (*fini) (void), 
                      void (*rtld_fini) (void), 
                      void (*__unbounded stack_end));

The runtime memory address of main() is the argument corresponding to the first parameter, int (*main) (int, char**, char**). On 32-bit x86, arguments are pushed onto the runtime stack in the reverse order of their corresponding parameters, so the last value pushed before the call to __libc_start_main is the address of main(). On x86-64, the first argument is passed in a register instead, so the address of main() is the value loaded into %rdi just before the call.

One can enter main() in gdb in 4 steps:

  1. Find the program entry point
  2. Find where __libc_start_main is called
  3. Set a breakpoint at the address passed as the first argument to __libc_start_main (the last value pushed on 32-bit x86, the value loaded into %rdi on x86-64)
  4. Let program execution continue until the breakpoint for main() is hit

The steps are the same for 32-bit and 64-bit ELF binaries; only the way the address of main() is passed to __libc_start_main differs.
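Step 1 can also be done without gdb, because the entry point is stored in the ELF header. The following is a minimal Python sketch, not part of the original answer; the function names elf_entry_point and entry_point_of are hypothetical, and it assumes a well-formed header and reads only the e_entry field.

```python
import struct


def elf_entry_point(header: bytes) -> int:
    """Return the e_entry field from the first bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = header[4] == 2                  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if header[5] == 1 else ">"    # EI_DATA: 1 = little, 2 = big endian
    # e_entry follows e_ident (16 bytes), e_type (2), e_machine (2), e_version (4),
    # so it starts at offset 24 and is 4 or 8 bytes wide depending on the class.
    fmt = endian + ("Q" if is_64bit else "I")
    (entry,) = struct.unpack_from(fmt, header, 24)
    return entry


def entry_point_of(path: str) -> int:
    """Read just enough of the file at `path` to extract its entry point."""
    with open(path, "rb") as f:
        return elf_entry_point(f.read(64))
```

Calling hex(entry_point_of("test_32")) on a binary should give the same address that "info file" reports in gdb for a non-relocated executable.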

Entering main() in an example stripped 32-bit ELF binary called "test_32":

$ gdb -q -nh test_32
Reading symbols from test_32...(no debugging symbols found)...done.
(gdb) info file                                  # step 1
Symbols from "/home/c/test_32".
Local exec file:
    `/home/c/test_32', file type elf32-i386.
    Entry point: 0x8048310
    < output snipped >
(gdb) break *0x8048310
Breakpoint 1 at 0x8048310
(gdb) run
Starting program: /home/c/test_32 

Breakpoint 1, 0x08048310 in ?? ()
(gdb) x/13i $eip                                 # step 2
=> 0x8048310:   xor    %ebp,%ebp
   0x8048312:   pop    %esi
   0x8048313:   mov    %esp,%ecx
   0x8048315:   and    $0xfffffff0,%esp
   0x8048318:   push   %eax
   0x8048319:   push   %esp
   0x804831a:   push   %edx
   0x804831b:   push   $0x80484a0
   0x8048320:   push   $0x8048440
   0x8048325:   push   %ecx
   0x8048326:   push   %esi
   0x8048327:   push   $0x804840b                # address of main()
   0x804832c:   call   0x80482f0 <__libc_start_main@plt>
(gdb) break *0x804840b                           # step 3
Breakpoint 2 at 0x804840b
(gdb) continue                                   # step 4 
Continuing.

Breakpoint 2, 0x0804840b in ?? ()                # now in main()
(gdb) x/x $esp+4
0xffffd110: 0x00000001                           # argc = 1
(gdb) x/s **(char ***) ($esp+8)
0xffffd35c: "/home/c/test_32"                    # argv[0]
(gdb)
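The 32-bit pattern in the session above (a push of an immediate directly followed by the call) can be searched for in the raw bytes at the entry point. This is a rough Python sketch under stated assumptions, not a disassembler: it assumes a non-PIE i386 binary whose startup code pushes main's absolute address with "push imm32" (opcode 0x68) immediately before "call rel32" (opcode 0xE8). The function name main_from_start_i386 is hypothetical.

```python
import struct


def main_from_start_i386(code: bytes):
    """Scan startup code for `push imm32` directly followed by `call rel32`.

    `code` holds the bytes starting at the ELF entry point. Returns the pushed
    immediate (the presumed address of main) or None if the pattern is absent.
    """
    for i in range(len(code) - 5):
        # 0x68 = push imm32 (5 bytes long), 0xE8 = call rel32
        if code[i] == 0x68 and code[i + 5] == 0xE8:
            (addr,) = struct.unpack_from("<I", code, i + 1)
            return addr
    return None
```

A byte-pattern match like this can misfire when an immediate happens to contain 0x68, so the result is best confirmed by setting a breakpoint at the returned address as in step 3.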

Entering main() in an example stripped 64-bit ELF binary called "test_64":

$ gdb -q -nh test_64
Reading symbols from test_64...(no debugging symbols found)...done.
(gdb) info file                                  # step 1
Symbols from "/home/c/test_64".
Local exec file:
    `/home/c/test_64', file type elf64-x86-64.
    Entry point: 0x400430
    < output snipped >
(gdb) break *0x400430
Breakpoint 1 at 0x400430
(gdb) run 
Starting program: /home/c/test_64 

Breakpoint 1, 0x0000000000400430 in ?? ()
(gdb) x/11i $rip                                 # step 2
=> 0x400430:    xor    %ebp,%ebp
   0x400432:    mov    %rdx,%r9
   0x400435:    pop    %rsi
   0x400436:    mov    %rsp,%rdx
   0x400439:    and    $0xfffffffffffffff0,%rsp
   0x40043d:    push   %rax
   0x40043e:    push   %rsp
   0x40043f:    mov    $0x4005c0,%r8
   0x400446:    mov    $0x400550,%rcx
   0x40044d:    mov    $0x400526,%rdi            # address of main()
   0x400454:    callq  0x400410 <__libc_start_main@plt>
(gdb) break *0x400526                            # step 3
Breakpoint 2 at 0x400526
(gdb) continue                                   # step 4
Continuing.

Breakpoint 2, 0x0000000000400526 in ?? ()        # now in main()
(gdb) print $rdi                                    
$3 = 1                                           # argc = 1
(gdb) x/s *(char **) $rsi
0x7fffffffe35c: "/home/c/test_64"                # argv[0]
(gdb) 
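On x86-64 the address of main() is loaded into %rdi rather than pushed, so a byte-level search looks for a different pattern. The sketch below is an illustration under assumptions, not a general solution: it recognizes "mov $imm32,%rdi" (bytes 48 c7 c7, as in the session above) and, as an assumption about position-independent executables, "lea rel32(%rip),%rdi" (bytes 48 8d 3d), each directly followed by a call. The function name main_from_start_x86_64 and the base parameter are hypothetical.

```python
import struct


def main_from_start_x86_64(code: bytes, base: int):
    """Scan startup code for the instruction that loads main's address into %rdi.

    `code` holds the bytes starting at the entry point and `base` is the
    address of code[0]. Returns the presumed address of main, or None.
    """
    for i in range(len(code) - 7):
        # Both load forms are 7 bytes long; require a call right after them
        # (0xE8 = call rel32, 0xFF = first byte of an indirect call).
        if code[i + 7] not in (0xE8, 0xFF):
            continue
        opcode = code[i:i + 3]
        (imm,) = struct.unpack_from("<i", code, i + 3)
        if opcode == b"\x48\xc7\xc7":      # mov $imm32,%rdi (imm is sign-extended)
            return imm & 0xFFFFFFFFFFFFFFFF
        if opcode == b"\x48\x8d\x3d":      # lea rel32(%rip),%rdi
            return base + i + 7 + imm      # relative to the next instruction
    return None
```

For a position-independent executable the returned value is relative to the addresses in the file, so the load address chosen at run time still has to be added before setting a breakpoint.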

A detailed treatment of program initialization, including what happens before main() is called and how execution reaches main(), can be found in Patrick Horgan's tutorial "Linux x86 Program Start Up or - How the heck do we get to main()?"

julian answered Oct 03 '22 09:10


If the binary is heavily stripped, or even packed (for example with UPX), you can still start from the ELF entry point in gdb. First read the entry point from the ELF header:

$ readelf -h mybinary | grep Entry
Entry point address:               0x103120

Then set a breakpoint at that address in GDB:

$ gdb mybinary
(gdb) break * 0x103120
Breakpoint 1 at 0x103120
(gdb) r
Starting program: mybinary 
Breakpoint 1, 0x0000000000103120 in ?? ()

You can then disassemble the instructions at the entry point (this example is a PowerPC binary):

(gdb) x/10i 0x0000000000103120
=> 0x103120:    bl      0x103394
   0x103124:    dcbtst  0,r5
   0x103128:    mflr    r13
   0x10312c:    cmplwi  r7,2
   0x103130:    bne     0x103214
   0x103134:    stw     r5,0(r6)
   0x103138:    add     r4,r4,r3
   0x10313c:    lis     r0,-32768
   0x103140:    lis     r9,-32768
   0x103144:    addi    r3,r3,-1

I hope it helps

Breno Leitão answered Oct 03 '22 09:10