
What is __libc_start_main and _start?

Tags: c, linux, gcc, gdb, elf

For the past few days I have been trying to understand what happens behind the curtain when we execute a C program. However, even after reading numerous posts, I cannot find a detailed and accurate explanation. Can someone please help me out?

perdix asked Mar 02 '23 07:03


1 Answer

You would usually find special names like this for specific uses when compiling and linking programs.

Keeping in mind that this answer is of a general nature rather than a specific implementation of starting up a C environment, you would typically have something like a _start label, which would be the actual entry point for an executable (from the hosting environment's point of view).

This would be located in some object file or library (like crt0.o for the C runtime start-up code) and would normally be added automagically to your executable file by the linker, similar to the way the C runtime library is added (a).

The operating system code for starting a program would then be akin to (pseudo-code, obviously, and with much less error checking than it should have):

def spawnProg(progName):
    id = newProcess()                           # make process space
    loadProgram(pid = id, file = progName)      # load program into it
    newThread(pid = id, initialPc = '_start')   # make thread to run it
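On a POSIX system, the closest thing you can see from user space is the fork/exec pair: fork makes the new process and exec asks the OS to load the program and begin running it at its entry point. The following is only a minimal sketch under that assumption; run_prog is a hypothetical helper name, not part of any library.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start `path` with `argv`, wait for it to finish,
   and return its exit status (or -1 if it could not be started/waited on). */
int run_prog(const char *path, char *const argv[]) {
    pid_t pid = fork();              /* make a new process */
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(path, argv);           /* load the program; the OS starts it at its entry point */
        _exit(127);                  /* only reached if execv itself failed */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with "/bin/sh" and the arguments "sh", "-c", "exit 7" should hand back 7, which is the value the child's start-up code eventually passed back to the OS.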

Even though you yourself create a main when coding in C, that's not really where things start happening. There's a whole slew of things that need to be done even before your main program starts. Hence the content of the C start-up code would be along the lines of (at its most simplistic):

_start:  ;; Weave magic here to set up C and libc.
    ;; Note this is example code for a mythical implementation,
    ;; intended to show how it could work. It is not specifically
    ;; bound to any given implementation.
    call __setup_for_c       ; Set up C environment.
    call __libc_start_main   ; Set up standard library.
    call _main               ; Call your main.
    call __libc_stop_main    ; Tear down standard library.
    call __teardown_for_c    ; Tear down C environment.
    jmp  __exit              ; Return to OS.
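The same ordering can be modelled in plain C. This is a sketch only: fake_start, setup_for_c, teardown_for_c and demo_main are hypothetical names made up for illustration, and the function returns the status instead of exiting so that it can be called like any other function. Real implementations arrange the pieces differently; in glibc, for instance, __libc_start_main is handed the address of main, calls it itself, and passes the return value to exit.

```c
#include <string.h>

/* All names here are hypothetical; none of them exist in a real libc. */
static char trace[64];

static void note(const char *step) {
    /* Append one step label, always leaving room for the terminator. */
    strncat(trace, step, sizeof trace - strlen(trace) - 1);
}

static void setup_for_c(void)    { note("setup;"); }
static void teardown_for_c(void) { note("teardown;"); }

/* Stand-in for the main() that you would write. */
int demo_main(int argc, char **argv) {
    (void)argv;
    note("main;");
    return argc;                 /* status handed back to the start-up code */
}

/* Model of the start-up sequence: set up, run main, tear down. */
int fake_start(int (*user_main)(int, char **), int argc, char **argv) {
    trace[0] = '\0';
    setup_for_c();
    int status = user_main(argc, argv);
    teardown_for_c();
    return status;               /* real start-up code would exit with this and never return */
}

const char *fake_start_trace(void) { return trace; }
```

The point of returning status is the one detail worth remembering: whatever main returns becomes the exit status of the process.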

The "weaving of magic" is whatever it takes to make the environment ready for a C program. This may include things like:

  • setting up static data (this is supposed to be initialised to zeros so it's probably just an allocation of a chunk of memory, which is then zeroed by the start-up code - otherwise you would need to store a chunk of that size, already zeroed, in the executable file);
  • preparing argc and argv on the stack, and even preparing the stack itself (there are specific calling conventions that may be used for C, and the operating system may not set up the stack to suit them when it transfers control to _start, since it does not know the needs of the process);
  • setting up thread-specific data structures (things like random number generators, or error variables, per thread);
  • initialising the C library in other ways; and so on.
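You can observe a little of this from ordinary C code. The sketch below assumes GCC or Clang, since __attribute__((constructor)) is a compiler extension rather than standard C; the variable and function names are made up for illustration. It checks that static data with no initialiser starts out as zeros, and that code can be made to run during start-up, before main.

```c
#include <stdbool.h>
#include <stddef.h>

static int zeroed[8];          /* no initialiser: the C standard guarantees zeros */
static int ran_before_main;    /* set by the constructor below */

/* GCC/Clang extension: this function runs during start-up, before main(). */
__attribute__((constructor))
static void early_init(void) {
    ran_before_main = 1;
}

/* True if every element of the uninitialised static array is zero. */
bool statics_are_zero(void) {
    for (size_t i = 0; i < sizeof zeroed / sizeof zeroed[0]; i++)
        if (zeroed[i] != 0)
            return false;
    return true;
}

/* True if early_init() had already run by the time this is called. */
bool init_ran_before_main(void) {
    return ran_before_main == 1;
}
```

Both functions should report true when called from main, even though nothing in main ever touched either variable.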

Only once all that is complete will it be okay to call your main function. There's also the likelihood that work needs to be done after your main exits, such as:

  • invoking atexit handlers (things you want run automatically on exit, no matter where the exit occurs);
  • detaching from shared resources (for example, shared memory if the OS doesn't do this automatically when it shuts down a process); and
  • freeing up any other resources not automatically cleaned when the process exits, that would otherwise hang around.
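The atexit case is easy to try out. The following sketch assumes a POSIX system (it uses fork and a pipe so the parent can see what the child's handlers did); atexit_order and the handler names are hypothetical. The child registers two handlers and exits, and the parent collects the letters they wrote. Standard C runs atexit handlers in the reverse order of registration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;   /* write end of the pipe, used by the handlers */

static void report(char c) {
    ssize_t n = write(report_fd, &c, 1);
    (void)n;                 /* nothing useful to do on failure here */
}

static void registered_first(void)  { report('A'); }
static void registered_second(void) { report('B'); }

/* Hypothetical helper: fill `buf` with the letters the child's handlers
   wrote, in the order they ran. Returns the count, or -1 on error. */
int atexit_order(char *buf, size_t cap) {
    int fds[2];
    if (cap == 0 || pipe(fds) != 0)
        return -1;
    fflush(NULL);                      /* avoid duplicated stdio output in the child */
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        report_fd = fds[1];
        atexit(registered_first);
        atexit(registered_second);
        exit(0);                       /* returning from main ends up on this path too */
    }
    close(fds[1]);
    size_t used = 0;
    ssize_t n;
    while (used < cap - 1 && (n = read(fds[0], buf + used, cap - 1 - used)) > 0)
        used += (size_t)n;
    buf[used] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (int)used;
}
```

With an 8-byte buffer this should yield the two letters "BA": the handler registered last runs first.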

(a) Many linkers can be told to not do that if, for example, you're writing something that doesn't use the standard C library, or if you want to provide your own _start routine for low-level work.

paxdiablo answered Mar 05 '23 15:03