I've looked through the links What is the difference between exit and return? and return statement vs exit() in main() to find the answer, but in vain.
The problem with the first link is that the answer assumes return from any function. I want to know the exact difference between the two when in the main() function. Even if there's only a little difference, I'd like to know what it is. Which is preferred and why? Is there any performance gain in using return over exit() (or exit() over return) with all sorts of compiler optimizations turned off?
The problem with the second link is that I'm not interested in what happens in C++. I want the answer specifically pertaining to C.
EDIT: After recommendation by a person, I actually tried to compare the assembly output of the following programs:
Note: Using gcc -S <myprogram>.c
Program mainf.c:
int main(void){
    return 0;
}
Assembly output:
.file "mainf.c"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 4.9.2-10ubuntu13) 4.9.2"
.section .note.GNU-stack,"",@progbits
Program mainf1.c:
#include <stdlib.h>
int main(void){
    exit(0);
}
Assembly output:
.file "mainf1.c"
.text
.globl main
.type main, @function
main:
.LFB2:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $0, %edi
call exit
.cfi_endproc
.LFE2:
.size main, .-main
.ident "GCC: (Ubuntu 4.9.2-10ubuntu13) 4.9.2"
.section .note.GNU-stack,"",@progbits
I'm not well versed in assembly, but I can see some differences between the two programs, with the exit() version being shorter than the return version. What's the difference?
There is practically no difference between calling exit or executing return from main as long as main returns a type that is compatible with int.
From the C11 Standard:
5.1.2.2.3 Program termination
1 If the return type of the main function is a type compatible with int, a return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument; reaching the } that terminates the main function returns a value of 0. If the return type is not compatible with int, the termination status returned to the host environment is unspecified.
Disclaimer: This answer does not quote the C Standards.
Both methods jump into GLibC code, and to know exactly what that code is doing, or which one is faster or more efficient, you'll need to read it. If you want to know more about GLibC, you should check the sources for GCC and GLibC. There are links at the end for those.
First: there's a difference between exit(3) and _exit(2). The first is a GLibC wrapper around the second, which is a system call. The one we use in our program, and which requires the inclusion of stdlib.h, is exit(3): the GLibC wrapper, not the system call.
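One observable consequence of that distinction is that exit(3) runs handlers registered with atexit before terminating, while _exit(2) does not. A minimal sketch of this, assuming a POSIX system (fork, pipe, waitpid); the names handler_pipe_fd, leave_mark, and handler_runs are hypothetical and chosen only for illustration:

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int handler_pipe_fd = -1;   /* write end used by the atexit handler */

/* atexit handler: leave one byte in the pipe as proof that it ran. */
static void leave_mark(void)
{
    ssize_t n = write(handler_pipe_fd, "x", 1);
    (void)n;
}

/* Hypothetical helper: fork a child that registers leave_mark() and then
 * terminates with exit(0) (use_library_exit != 0) or _exit(0) (== 0).
 * Returns the number of bytes the handler wrote, or -1 on error. */
int handler_runs(int use_library_exit)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        handler_pipe_fd = fds[1];
        if (atexit(leave_mark) != 0)
            _exit(1);
        if (use_library_exit)
            exit(0);    /* library function: runs atexit handlers first */
        _exit(0);       /* system call wrapper: terminates immediately */
    }

    close(fds[1]);      /* parent keeps only the read end */
    char buf[8];
    ssize_t n = read(fds[0], buf, sizeof buf);  /* 0 means EOF, nothing written */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (int)n;
}
```

Calling handler_runs(1) and handler_runs(0) from a main function shows whether the handler ran in each case. Since returning from main is equivalent to calling exit, a handler registered this way would also run after return.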
Now, programs are not just your simple instructions. They contain heavy loads of GLibC's own instructions. These GLibC functions serve several purposes related to loading and providing the library functionality you use. For that to work GLibC must be "inside" your program.
So, how is GLibC inside your program? Well, it puts itself there through your compiler (it sets some static code and some hooks into the dynamic library) - most likely you're using gcc.
I suppose you know what stack frames are, so I won't explain them. The cool thing to notice is that main() itself has its own stack frame. And that stack frame returns somewhere, and it must return... But to where?
Let's compile the following:
int main(void)
{
    return 0;
}
And compile and debug it with:
$ gcc -o main main.c
$ gdb main
(gdb) disass main
Dump of assembler code for function main:
0x00000000004005e8 <+0>: push %rbp
0x00000000004005e9 <+1>: mov %rsp,%rbp
0x00000000004005ec <+4>: mov $0x0,%eax
0x00000000004005f1 <+9>: pop %rbp
0x00000000004005f2 <+10>: retq
End of assembler dump.
(gdb) break main
(gdb) run
Breakpoint 1, 0x00000000004005ec in main ()
(gdb) stepi
...
Now, stepi will make for the fun part. It executes one instruction at a time, so it's perfect for following function calls. After you run stepi for the first time, just hold your finger on ENTER until you get tired.
What you must observe is the sequence in which functions are called with this method. You see, ret is a "jumping" instruction (edit: after David Hoelzer's comment, I see that calling ret a simple jump is an over-generalization): after we pop rbp, ret itself will pop the return pointer from the stack and jump to it. So, if GLibC built that stack frame, retq is making our return 0; C statement jump right into GLibC's own code! How clever!
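Conceptually, the code that main returns into takes main's return value and passes it to exit. A simplified sketch of that idea, assuming a POSIX system for fork and waitpid; this is not GLibC's actual startup code, and startup_sketch, returns_argc, and status_after_return are hypothetical names used only for illustration:

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef int (*entry_fn)(int argc, char **argv);

/* Simplified stand-in for the runtime's startup routine: call the entry
 * point and hand whatever it returns to exit(). Not GLibC's real code. */
static void startup_sketch(entry_fn entry, int argc, char **argv)
{
    exit(entry(argc, argv));
}

/* Hypothetical main-like function that simply returns its argc. */
int returns_argc(int argc, char **argv)
{
    (void)argv;
    return argc;
}

/* Run startup_sketch() in a forked child and report the exit status the
 * parent observes, or -1 on error. */
int status_after_return(entry_fn entry, int argc)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        char *argv[] = { NULL };
        startup_sketch(entry, argc, argv);   /* never returns */
    }

    int wstatus;
    if (waitpid(pid, &wstatus, 0) < 0 || !WIFEXITED(wstatus))
        return -1;
    return WEXITSTATUS(wstatus);
}
```

Here the value returned by the main-like function becomes the process's exit status without that function ever calling exit itself, which is the same effect the ret instruction in main has when it lands in the runtime's code.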
The order of function calls I got started roughly like this:
__libc_start_main
exit
__run_exit_handlers
_dl_fini
rtld_lock_default_lock_recursive
_dl_fini
_dl_sort_fini
Compiling this:
#include <stdlib.h>
int main(void)
{
    exit(0);
}
And compiling and debugging...
$ gcc -o exit exit.c
$ gdb exit
(gdb) disass main
Dump of assembler code for function main:
0x0000000000400628 <+0>: push %rbp
0x0000000000400629 <+1>: mov %rsp,%rbp
0x000000000040062c <+4>: mov $0x0,%edi
0x0000000000400631 <+9>: callq 0x4004d0 <exit@plt>
End of assembler dump.
(gdb) break main
(gdb) run
Breakpoint 1, 0x000000000040062c in main ()
(gdb) stepi
...
And the function sequence I got was:
exit@plt
??
_dl_runtime_resolve
_dl_fixup
_dl_lookup_symbol_x
do_lookup_x
check_match
_dl_name_match
strcmp
There's a cool tool for printing the symbols defined within a binary: nm. I suggest you take a look at it, as it will give you an idea of how much "crap" is added to a simple program like the ones above.
To use it in the simplest form:
$ nm main
$ nm exit
That will print a list of symbols in the file. Note that this list does not include references these functions will make. So if a given function in this list calls another function, the other probably won't be in the list.
It depends heavily on the way GLibC chooses to handle a simple stack frame return from main and how it implements the exit wrapper. In the end, the _exit(2) system call will get called and you'll exit your process.
Finally, to really answer your question: both the methods jump into GLibC code, and to know exactly what that code is doing you'll need to read it. If you want to know more about the GLibC, you should check the sources for the GCC and GLibC.
stdlib/exit.c and stdlib/exit.h for the implementations.
kernel/exit.c for the _exit(2) system call implementation, and include/syscalls.h for the preprocessor magic behind it.
gcc (compiler, not suite) sources, and would appreciate if anyone could point out where the runtime sequence is defined.