I am wondering if it is legal to return with ret from a program's entry point.
Example with NASM:
section .text
global _start
_start:
ret
; Linux: nasm -f elf64 foo.asm -o foo.o && ld foo.o
; OS X: nasm -f macho64 foo.asm -o foo.o && ld foo.o -lc -macosx_version_min 10.12.0 -e _start -o foo
ret pops a return address from the stack and jumps to it.
But are the top bytes of the stack a valid return address at the program entry point, or do I have to call exit?
Also, the program above does not segfault on OS X. Where does it return to?
When you are using macOS and link with:
ld foo.o -lc -macosx_version_min 10.12.0 -e _start -o foo
you get a dynamically linked version of your code. _start isn't the true entry point; the dynamic loader is. As one of its last steps, the dynamic loader does C/C++/Objective-C runtime initialization and then calls the entry point you specified with the -e option. The Apple documentation about Forking and Executing the Process has these paragraphs:
A Mach-O executable file contains a header consisting of a set of load commands. For programs that use shared libraries or frameworks, one of these commands specifies the location of the linker to be used to load the program. If you use Xcode, this is always /usr/lib/dyld, the standard OS X dynamic linker.
When you call the execve routine, the kernel first loads the specified program file and examines the mach_header structure at the start of the file. The kernel verifies that the file appear to be a valid Mach-O file and interprets the load commands stored in the header. The kernel then loads the dynamic linker specified by the load commands into memory and executes the dynamic linker on the program file.
The dynamic linker loads all the shared libraries that the main program links against (the dependent libraries) and binds enough of the symbols to start the program. It then calls the entry point function. At build time, the static linker adds the standard entry point function to the main executable file from the object file /usr/lib/crt1.o. This function sets up the runtime environment state for the kernel and calls static initializers for C++ objects, initializes the Objective-C runtime, and then calls the program’s main function
In your case that is _start. In this environment, where you are creating a dynamically linked executable, you can do a ret and have it return to the code that called _start, which makes the exit system call for you. This is why it doesn't crash. If you review the generated executable with gobjdump -Dx foo you should get:
start address 0x0000000000000000
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000001 0000000000001fff 0000000000001fff 00000fff 2**0
CONTENTS, ALLOC, LOAD, CODE
SYMBOL TABLE:
0000000000001000 g 03 ABS 01 0010 __mh_execute_header
0000000000001fff g 0f SECT 01 0000 [.text] _start
0000000000000000 g 01 UND 00 0100 dyld_stub_binder
Disassembly of section .text:
0000000000001fff <_start>:
1fff: c3 retq
Notice that start address is 0, and the code at 0 is dyld_stub_binder. This is the dynamic loader stub that eventually sets up a C runtime environment and then calls your entry point _start. If you don't override the entry point it defaults to main.
If, however, you build a static executable, no code is executed before your entry point, and ret should crash since there is no valid return address on the stack. The documentation quoted above says:
For programs that use shared libraries or frameworks, one of these commands specifies the location of the linker to be used to load the program.
A statically built executable doesn't use the dynamic loader dyld with crt1.o embedded in it. CRT = C runtime library, which on macOS covers C++/Objective-C as well. The dynamic loading steps are not performed, the C/C++/Objective-C initialization code is not executed, and control is transferred directly to your entry point.
To build statically, drop -lc (or -lSystem) from the linker command and add the -static option:
ld foo.o -macosx_version_min 10.12.0 -e _start -o foo -static
If you run this version it should produce a segmentation fault. gobjdump -Dx foo produces:
start address 0x0000000000001fff
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000001 0000000000001fff 0000000000001fff 00000fff 2**0
CONTENTS, ALLOC, LOAD, CODE
1 LC_THREAD.x86_THREAD_STATE64.0 000000a8 0000000000000000 0000000000000000 00000198 2**0
CONTENTS
SYMBOL TABLE:
0000000000001000 g 03 ABS 01 0010 __mh_execute_header
0000000000001fff g 0f SECT 01 0000 [.text] _start
Disassembly of section .text:
0000000000001fff <_start>:
1fff: c3 retq
You should notice that start address is now 0x1fff, which is the entry point you specified (_start). There is no dynamic loader stub as an intermediary.
Under Linux, when you specify your own entry point, ret will cause a segmentation fault whether you build a static or a shared executable. There is good information on how ELF executables are run on Linux in this article and the dynamic linker documentation. The key point is that the Linux documentation makes no mention of C/C++/Objective-C runtime initialisation, unlike the macOS dynamic linker documentation.
The key difference between the Linux dynamic loader (ld.so) and the macOS one (dyld) is that the macOS dynamic loader performs C/C++/Objective-C startup initialization by including the entry point from crt1.o. The code in crt1.o then transfers control to the entry point you specified with -e (the default is main). In Linux the dynamic loader makes no assumption about the type of code that will be run. After the shared objects are processed and initialized, control is transferred directly to the entry point.
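Since nothing on Linux sits behind _start to return to, the entry point has to end the process itself. A minimal sketch for x86-64 Linux, assuming the same build steps as in the question (the file name exit0.asm is hypothetical):

```nasm
; exit0.asm (hypothetical file name)
; Build (Linux x86-64): nasm -f elf64 exit0.asm -o exit0.o && ld exit0.o -o exit0
section .text
global _start
_start:
    xor     edi, edi        ; exit status 0 (first syscall argument goes in rdi)
    mov     eax, 60         ; 60 = exit on x86-64 Linux
    syscall                 ; terminate the process; control never comes back
```

Running ./exit0 followed by echo $? should print 0 instead of reporting a segmentation fault.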
FreeBSD (on which MacOS is based) and Linux share one thing in common. When loading 64-bit executables the layout of the user stack when a process is created is the same. The stack for 32-bit processes is similar but pointers and data are 4 bytes wide, not 8.
Although there isn't a return address on the stack, there is other data representing the number of arguments, the arguments, environment variables, and other information. This layout is not the same as what the main function in C/C++ expects. It is part of the C startup code to convert the stack at process creation to something compatible with the C calling convention and the expectations of the function main (argc, argv, envp).
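To make the layout concrete, here is a small sketch for x86-64 Linux that reads the argument count straight from the top of the stack and uses it as the exit status. It assumes a build with nasm and ld as in the question; the file name argc.asm is hypothetical.

```nasm
; argc.asm (hypothetical file name)
; Build (Linux x86-64): nasm -f elf64 argc.asm -o argc.o && ld argc.o -o argc
section .text
global _start
_start:
    mov     rdi, [rsp]      ; top of stack holds argc, not a return address
    ; [rsp+8] is argv[0]; the argv pointers end with a NULL,
    ; and the environment pointers follow, also ending with a NULL
    mov     eax, 60         ; 60 = exit on x86-64 Linux
    syscall                 ; exit(argc)
```

Running ./argc a b and then echo $? should print 3: the program name plus two arguments. A ret at the same spot would pop that count and try to jump to it as if it were an address.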
I wrote more on this subject in this Stack Overflow answer, which shows how a statically linked macOS executable can traverse the program arguments passed by the kernel at process creation.
Supplementing what Michael Petch already answered: from the perspective of a runnable Mach-O executable, program launch happens through either the LC_MAIN load command (used by most modern executables since 10.7), which involves DYLD in the process, or the backward-compatible LC_UNIXTHREAD load command. The former is the variant where your ret is allowed, and in fact preferable, because you return control to DYLD __mh_execute_header. This will be followed by a buffer flush.
As an alternative to ret, you can make the exit system call, either through the undocumented syscall kernel API (64-bit; int 0x80 for 32-bit) or through the documented C library wrapper made available via DYLD.
If your executable does not use LC_MAIN, you are left with the legacy LC_UNIXTHREAD, where there is no alternative to the exit system call; ret will cause a segmentation fault.
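For the LC_UNIXTHREAD case, the direct system call might look like the sketch below. It assumes x86-64 macOS and a toolchain that still accepts the -static link line from the first answer; the file name exit.asm is hypothetical. Because this kernel interface is undocumented, as noted above, treat the call number as an observed convention rather than a stable contract.

```nasm
; exit.asm (hypothetical file name)
; Build (x86-64 macOS, static):
;   nasm -f macho64 exit.asm -o exit.o
;   ld exit.o -macosx_version_min 10.12.0 -e _start -o exit -static
section .text
global _start
_start:
    mov     eax, 0x2000001  ; 0x2000000 (BSD syscall class) + 1 (exit)
    xor     edi, edi        ; exit status 0
    syscall                 ; terminate the process; control never comes back
```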