I first noticed it while playing with GDB's rbreak . command, and then made a minimal example:
(gdb) file hello_world.out
Reading symbols from hello_world.out...done.
(gdb) b _init
Breakpoint 1 at 0x4003e0
(gdb) b _start
Breakpoint 2 at 0x400440
(gdb) run
Starting program: /home/ciro/bak/git/cpp/cheat/gdb/hello_world.out
Breakpoint 1, _init (argc=1, argv=0x7fffffffd698, envp=0x7fffffffd6a8) at ../csu/init-first.c:52
52 ../csu/init-first.c: No such file or directory.
(gdb) continue
Continuing.
Breakpoint 2, 0x0000000000400440 in _start ()
(gdb) continue
Continuing.
Breakpoint 1, 0x00000000004003e0 in _init ()
(gdb) info breakpoints
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   <MULTIPLE>
        breakpoint already hit 2 times
1.1                         y   0x00000000004003e0 <_init>
1.2                         y   0x00007ffff7a36c20 in _init at ../csu/init-first.c:52
2       breakpoint     keep y   0x0000000000400440 <_start>
        breakpoint already hit 1 time
Note that there are 2 _init: one in csu/init-first.c, and the other seems to come from sysdeps/x86_64/crti.S. I'm talking about the csu one.
Isn't _start supposed to be the entry point set by the linker, and stored in the ELF header? What mechanism makes _init run first? What is its purpose?
Tested on GCC 4.8, glibc 2.19, GDB 7.7.1 and Ubuntu 14.04.
The _start function is defined in the sysdeps/x86_64/start.S assembly file and does preparation, such as getting argc/argv from the stack and setting up the stack, before the __libc_start_main function is called. The __libc_start_main function from the csu/libc-start.
The _start function: for most C and C++ programs, the true entry point is not main but the _start function. This function initializes the program runtime and invokes the program's main function.
Both __libc_csu_init and call_init do basically the same thing: they run all constructors registered in the dynamic table entries INIT and INIT_ARRAY.
Where the debugger halts first in your example isn't the real beginning of the process.
In the ELF header there is an entry for the program interpreter (dynamic linker). On 64-bit Linux its value is /lib64/ld-linux-x86-64.so.2. The kernel sets the initial instruction pointer to the entry point of this program interpreter. Its symbol name is _start too, like the program's _start.
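To see which program interpreter an executable requests, you can read it straight out of the file. The following is a minimal sketch, not a replacement for readelf: it assumes a 64-bit little-endian ELF image, and the function name elf64_interpreter is made up for this example. It walks the program headers and returns the path stored in the PT_INTERP segment.

```python
import struct

PT_INTERP = 3  # program header type that names the program interpreter


def elf64_interpreter(data):
    """Return the PT_INTERP path of a 64-bit little-endian ELF image, or None.

    Hypothetical helper for illustration; other ELF classes are rejected.
    """
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch handles only 64-bit little-endian ELF")

    # ELF64 header fields: e_phoff at offset 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)

    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from("<I", data, base)
        if p_type == PT_INTERP:
            # Within an ELF64 program header: p_offset at +8, p_filesz at +32.
            (p_offset,) = struct.unpack_from("<Q", data, base + 8)
            (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
            path = data[p_offset:p_offset + p_filesz]
            return path.rstrip(b"\0").decode()

    # Statically linked executables have no PT_INTERP segment.
    return None
```

Calling elf64_interpreter(open("hello_world.out", "rb").read()) on a dynamically linked executable should return the interpreter path described above.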
After the dynamic linker has done its work, which includes calling functions such as _init in glibc, it calls the entry point of the program.
The breakpoint at _start doesn't work for the dynamic linker because it takes only the address of the program's _start.
You can find the entry point address with readelf -h /lib64/ld-linux-x86-64.so.2.
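If you want the same number without readelf, the entry point is a fixed field in the ELF header. Here is a small sketch, again assuming a 64-bit little-endian file; the function name elf64_entry_point is hypothetical. It returns the e_entry value that readelf -h prints as the entry point address.

```python
import struct


def elf64_entry_point(data):
    """Return e_entry from a 64-bit little-endian ELF header.

    Hypothetical helper for illustration; other ELF classes are rejected.
    """
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch handles only 64-bit little-endian ELF")

    # e_entry is the 8-byte field at offset 24, after e_ident (16 bytes),
    # e_type (2), e_machine (2) and e_version (4).
    (e_entry,) = struct.unpack_from("<Q", data, 24)
    return e_entry
```

Running it on both the program and /lib64/ld-linux-x86-64.so.2 gives the two _start addresses being discussed. For a shared object such as the dynamic linker, the value is relative to wherever the kernel maps the file, so it will not match the run-time address seen in GDB directly.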
You could also set a breakpoint at _dl_start and print a backtrace to see that this function is called from the dynamic linker's _start.
If you download glibc's current source code, you can find the entry point of the dynamic loader at glibc-2.21/sysdeps/x86_64/dl-machine.h, starting on line 121.