
Difference between the roles of loader and C runtime initialization

Tags: c, runtime, loader

I was reading about the roles of the C runtime initialization from this link: http://www.embecosm.com/appnotes/ean9/html/ch05s02.html

It says that the runtime initialization does tasks like setting up the stack, and later pages say in more detail that it also initializes the bss segment with zeroes. Elsewhere I have also read that it initializes the data segment and some other segments.

This made me wonder what the loader does, then, because some of these tasks are also the loader's responsibility.

So, my questions:

  1. What does the runtime initialization or C runtime actually do?
  2. What does the loader actually do?

EDIT

OK, so if that link describes the role of runtime initialization for embedded systems specifically, what role does it have on normal systems? As far as I can tell, the runtime initialization would then just call main, with no other work left for it.

tapananand asked Dec 22 '14 05:12


1 Answer

  1. What does the runtime initialization or C runtime actually do?

Wikipedia defines a runtime library as:

a set of low-level routines used by a compiler to invoke some of the behaviors of a runtime environment, by inserting calls to the runtime library into compiled executable binary.

In the case of C programs, the runtime library has very little to do outside of bootstrapping the program. The toolchain links the C runtime's startup code into the executable; that code sets up various environmental things and then basically hands off control to the user by calling main.

Given the responses in the comments on your question, you may have already figured out that the way a program is bootstrapped varies from one target environment to the next. Given the number of platforms and operating systems that C supports now and has supported in the past, there is no way to enumerate every way in which a C runtime has worked or currently works.

Every C library has its own C runtime and every environment supporting C will likely have different bootstrap problems and requirements. These requirements depend largely on features of the operating system or hardware combined with the completeness of the C implementation. However, I can answer some things that C runtimes typically do in environments with which you may be familiar.

  • Since the C runtime is responsible for calling main, it follows that calling functions registered through atexit(3) would be the responsibility of the C runtime.

  • Resolve and call any constructor / destructor interfaces (_init, _fini, etc.)

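  • Initialize and call the run-time loader (which is responsible for resolving and loading dynamic shared objects registered at link-time and loaded at runtime). On some systems the kernel starts the run-time loader itself: on Linux, for a dynamically linked ELF binary, the kernel maps the interpreter named in the binary, and that loader runs before the program's _start.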

  • Handling exit of detached threads gracefully.

  • Initialization and passing of argc and argv into the program's main.

  • Define and initialize various C library global symbols. For example, it sets errno properly for the environment (modern systems define errno to be thread-safe, so it needs to live in TLS). environ is another global symbol that needs initialization prior to calling main.

  • For that matter, the C runtime needs to set up TLS.

  • Tons more.

You may be interested in looking through the glibc implementation of the runtime, found in the "csu" (C start-up) directory. (There are some machine-specific portions outside of this directory.)

Different systems will have different requirements. As you've read, an embedded system may have significantly more work for the runtime because it may be responsible for tasks ranging from register initialization to program loading and execution (where this is not provided by any kernel). The distinction between "C runtime" and "kernel" can become blurry given sufficiently complex standalone projects on embedded targets.

Now:

  2. What does [a] loader actually do?

There are many types of loaders, also depending on the runtime environment. For a small embedded environment with an EEPROM, the loader may be some firmware that starts execution of whatever it finds at address 0. You might also think of yourself as the loader, manually writing your binary out to the EEPROM.

In modern operating systems, there are a number of loaders.

  1. Bootloaders. Historically, these have operated in a fashion where the BIOS picks a boot device, looks at an address, reads 512 bytes of data into memory, and starts executing from there. I've been out of this world for a while now, so I'm not sure what the difference is with EFI/UEFI other than that they are significantly more complete (and complicated) bootstrap environments.

  2. Kernels. When you execute a program, tons of stuff is going on under the hood to get it going. Assuming you're running your program from a shell in some Unix-like OS, the loading process may follow something like this:

    • Your shell attempts to look for the binary somewhere in your environment-configured PATH. This is done by issuing a number of syscalls to the kernel, trying the filename under each directory in PATH in turn.
    • Assuming the file is found, the shell will usually fork(2) and execve(2). The fork(2) call causes the kernel to create a new process; the execve(2) call replaces the program image of that cloned process with the new one.
    • The kernel reads the first page of the file from its storage medium (disk, network, memory, whatever) and attempts to figure out how to execute it.
      • If it's an ELF binary, it can determine that from the binary's header. The kernel then maps the loadable segments of the binary into memory, based on the offsets and addresses specified in the ELF program headers, sets up mapped regions for the stack and whatnot, then begins executing at the entry address (also part of the ELF header). This entry point is probably _start, part of the C runtime.
      • If it's not an ELF binary, it could still be executable through an interpreter. The kernel will attempt to parse the interpreter from the start of the file (e.g. #!/bin/bash), resolve it, and execute it. Eventually it will find an ELF executable or it will fail.
    • The kernel begins executing the binary, probably at _start, as mentioned.
    • Eli Bendersky has a more thorough write-up on this, entitled "How statically linked programs run on linux".
  3. Run-time loaders / dynamic linkers / whatever you want to call them. I'll refer you to the "Anatomy of Linux dynamic libraries" article for information about how these work. Of course, the dlopen(3) / dlsym(3) / dlclose(3) / dlerror(3) set of functions is simply an API for interacting with the dynamic loader. I would highly recommend reading the manual pages of these interfaces to get a good idea of the feature set supported by the Linux dynamic loader, and what kinds of things the loader does.

dho answered Nov 08 '22 00:11