
How does the linker find the main function?

How does the linker find the main function in an x86-64 ELF-format executable?

RouteMapper asked Jul 17 '13 19:07


1 Answer

As a very generic overview: the linker assigns an address to the block of code identified by the symbol main, just as it does for every other symbol in your object files.

Strictly speaking, it doesn't assign a real address but an address relative to some base, which the loader translates to a real address when the program is executed.

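The actual entry point is usually not main but a symbol in the C runtime startup code (crt) that sets things up and then calls main. GNU ld by default looks for the symbol _start (shown simply as start in the sketch below) unless you specify a different one, for example with the -e option or an ENTRY command in a linker script.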

The linked code ends up in the .text section of the executable and could look something like this (very simplified):

Address | Code
1000      someFunction
...
2000      start
2001        call 3000
...
3000      main
...

When the linker writes the ELF header, it records the entry point (the e_entry field) as address 2000.

You can get the relative address of main by dumping the contents of the executable with a tool such as objdump. To get the actual address at runtime, take the address of the function itself: funcptr ptr = main; where funcptr is defined as a pointer to a function with the signature of main.

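#include <stdio.h>

/* Pointer to a function with the same signature as main. */
typedef int (*funcptr)(int argc, char* argv[]);

int main(int argc, char* argv[])
{
    funcptr ptr = main;
    /* %p expects a void*; converting a function pointer to void* is a
       common extension rather than strict ISO C. */
    printf("%p\n", (void *)ptr);
    return 0;
}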

The address of main is resolved correctly regardless of whether symbols have been stripped, because the linker resolves the symbol main to its relative address at link time; the symbol table is not needed for that at run time.

To see the entry point (start address) recorded in the file header, use objdump -f:

$ objdump -f funcptr.exe 

funcptr.exe:     file format pei-i386
architecture: i386, flags 0x0000013a:
EXEC_P, HAS_DEBUG, HAS_SYMS, HAS_LOCALS, D_PAGED
start address 0x00401000

Looking for main specifically, on my machine I get this:

$ objdump -D funcptr.exe | grep main
  40102c:       e8 af 01 00 00          call   4011e0 <_cygwin_premain0>
  401048:       e8 a3 01 00 00          call   4011f0 <_cygwin_premain1>
  401064:       e8 97 01 00 00          call   401200 <_cygwin_premain2>
  401080:       e8 8b 01 00 00          call   401210 <_cygwin_premain3>
00401170 <_main>:
  401179:       e8 a2 00 00 00          call   401220 <___main>
004011e0 <_cygwin_premain0>:
004011f0 <_cygwin_premain1>:
00401200 <_cygwin_premain2>:
00401210 <_cygwin_premain3>:
00401220 <___main>:

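Note that I am on Windows using Cygwin, so this is a 32-bit PE executable rather than an x86-64 ELF file, and your results will differ somewhat (for example, the symbol appears as _main here). In this build, main lives at 0x00401170.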

Dave Rager answered Sep 28 '22 21:09