The symbol _start is the entry point of your program. That is, the address of that symbol is the address jumped to on program start. Normally, the function named _start is supplied by a file called crt0.o, which contains the startup code for the C runtime environment. It performs some initial setup, populates the argument array argv, counts how many arguments there are, and then calls main. After main returns, exit is called.
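The sequence described above (count the arguments, call main, hand the result to exit) can be sketched in ordinary C. This is only an illustration under stated assumptions, not the code found in any real crt0.o: the names main_fn, count_args, startup_sketch and demo_main are hypothetical, and the sketch returns the status instead of calling exit so that it can be run inside a normal program.

```c
#include <stddef.h>

/* Hypothetical type name: the signature the startup code expects main to have. */
typedef int (*main_fn)(int argc, char **argv);

/* Count the entries of a NULL-terminated argument vector. */
int count_args(char **argv)
{
    int argc = 0;
    while (argv[argc] != NULL)
        argc++;
    return argc;
}

/* Sketch of the startup sequence: count the arguments, call the program's
 * main, and return the value that real startup code would pass to exit(). */
int startup_sketch(char **argv, main_fn program_main)
{
    int argc = count_args(argv);
    return program_main(argc, argv);
}

/* Hypothetical stand-in for a user's main: reports argc as its status. */
int demo_main(int argc, char **argv)
{
    (void)argv; /* unused in this demo */
    return argc;
}
```

Calling startup_sketch with a NULL-terminated array of strings and demo_main yields the number of strings in that array, which is the value a real startup routine would forward to exit.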
If a program does not want to use the C runtime environment, it needs to supply its own code for _start. For instance, the reference implementation of the Go programming language does so because it needs a non-standard threading model that requires some special handling of the stack. It's also useful to supply your own _start when you want to write really tiny programs or programs that do unconventional things.
While main is the entry point for your program from a programmer's perspective, _start is the usual entry point from the OS perspective (the first instruction that is executed after your program is started by the OS).
In a typical C and especially C++ program, a lot of work has been done before execution enters main, in particular the initialization of global variables. Here you can find a good explanation of everything that's going on between _start() and main(), and also after main has exited again (see comment below).
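One way to observe that code runs before main is the constructor function attribute, a GCC and Clang extension rather than standard C. The following is a minimal sketch assuming one of those compilers; the names initialized_before_main and early_init are made up for the illustration.

```c
/* A plain global; the startup code gives it its final value before main runs. */
int initialized_before_main = 0;

/* GCC/Clang extension (not standard C): a function marked "constructor"
 * is invoked by the startup code before main is entered. */
__attribute__((constructor))
void early_init(void)
{
    initialized_before_main = 1;
}
```

By the time main starts executing, initialized_before_main already holds 1, even though nothing in main ever calls early_init.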
The necessary code for that is usually provided by the compiler writers in a startup file, but with the flag -nostartfiles you essentially tell the compiler: "Don't bother giving me the standard startup file, give me full control over what is happening right from the start".
This is sometimes necessary and often used on embedded systems, e.g. if you don't have an OS and have to manually enable certain parts of your memory system (such as caches) before your global objects are initialized.
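On a system without an OS, hand-written startup code commonly has to prepare the C globals itself: it copies the initial values of initialized data from ROM or flash into RAM and clears the zero-initialized region. The sketch below shows just that step under stated assumptions. The struct image_layout and the function init_c_globals are hypothetical; real startup code would take these addresses from symbols defined in its linker script, and any hardware setup (such as enabling caches) is target specific and omitted here.

```c
#include <stddef.h>

/* Hypothetical descriptor. Real startup code gets these addresses from
 * linker-script symbols instead of a struct. */
struct image_layout {
    const unsigned char *data_load;  /* initial values of globals, stored in ROM/flash */
    unsigned char *data_start;       /* run-time location of those globals in RAM */
    size_t data_size;
    unsigned char *bss_start;        /* globals that must start out as zero */
    size_t bss_size;
};

/* Plain loops are used on purpose: this early, library routines such as
 * memcpy may not be usable yet. */
void init_c_globals(const struct image_layout *img)
{
    size_t i;

    /* Copy the initial values of initialized globals into RAM. */
    for (i = 0; i < img->data_size; i++)
        img->data_start[i] = img->data_load[i];

    /* Clear the zero-initialized globals. */
    for (i = 0; i < img->bss_size; i++)
        img->bss_start[i] = 0;
}
```

A custom _start for such a target would typically perform its hardware setup first, then run a routine like this, and only afterwards call main.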
Here is a good overview of what happens during program startup before main. In particular, it shows that _start is the actual entry point to your program from the OS viewpoint. It is the very first address from which the instruction pointer will start counting in your program. The code there invokes some C runtime library routines to do some housekeeping, then calls your main, and then tears things down and calls exit with whatever exit code main returned.
When would one need to do this kind of thing?
When you want your own startup code for your program.
main is not the first entry point of a C program; _start is the first entry point, behind the curtain.
Example on Linux (x86-64, AT&T syntax):
    .globl _start                  # export _start so the linker can find it
_start:                            # _start is the entry point known to the linker
    xor %ebp, %ebp                 # effectively RBP := 0, mark the end of stack frames
    mov (%rsp), %edi               # get argc from the stack (implicitly zero-extended to 64-bit)
    lea 8(%rsp), %rsi              # take the address of argv from the stack
    lea 16(%rsp,%rdi,8), %rdx      # take the address of envp from the stack
    xor %eax, %eax                 # per ABI and compatibility with icc
    call main                      # %edi, %rsi, %rdx are the three args (of which first two are C standard) to main
    mov %eax, %edi                 # transfer the return of main to the first argument of _exit
    xor %eax, %eax                 # per ABI and compatibility with icc
    call _exit                     # terminate the program
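The address arithmetic in the assembly above can be restated in C to make the stack layout easier to follow: the word at the stack pointer holds argc, the argv pointers follow it, then a NULL terminator, then the envp pointers. This is a sketch for illustration only. The names start_args and decode_initial_stack are hypothetical, and the function is meant to be fed a simulated stack built in an array, not the real process stack.

```c
#include <stdint.h>

/* Hypothetical struct name: the three values the assembly passes to main. */
struct start_args {
    int    argc;
    char **argv;
    char **envp;
};

/* sp plays the role of %rsp on entry to _start. Its first pointer-sized
 * slot holds argc; the following slots are argv[0..argc-1], a NULL
 * terminator, and then the environment pointers. */
struct start_args decode_initial_stack(char **sp)
{
    struct start_args args;

    args.argc = (int)(intptr_t)sp[0];       /* mov (%rsp), %edi          */
    args.argv = sp + 1;                     /* lea 8(%rsp), %rsi         */
    args.envp = args.argv + args.argc + 1;  /* lea 16(%rsp,%rdi,8), %rdx */
    return args;
}
```

The final line skips the argc argument pointers plus the NULL that terminates argv, which is why the assembly adds 16 (one slot for argc, one for the terminator) on top of argc times 8.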
Is there any real world scenario where this would be useful?
If you mean implementing our own _start:
Yes, in most of the commercial embedded software I have worked with, we needed to implement our own _start to meet our specific memory and performance requirements.
If you mean dropping the main function and changing it to something else:
No, I don't see any benefit in doing that.