How does the kernel get an executable binary file running under Linux?
It seems like a simple question, but can anyone help me dig deeper? How is the file loaded into memory, and how does the code start executing?
Can anyone explain what happens, step by step?
The executable is invoked by filename with execve(). The kernel loads the executable into the process, and looks for a PT_INTERP entry in its ELF Program Headers; this specifies the filename of the dynamic linker (/lib/ld-linux. so.
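To make the PT_INTERP lookup concrete, here is a minimal user-space sketch in Python (not kernel code) that scans the program headers of an ELF image for a PT_INTERP entry. It assumes a 64-bit little-endian ELF; the function name read_interp is made up for this example.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic linker

def read_interp(data: bytes):
    """Return the PT_INTERP path of a 64-bit little-endian ELF image, or None."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    # ELF64 file header: e_ident, e_type, e_machine, e_version, e_entry,
    # e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, ...
    fields = struct.unpack_from("<16sHHIQQQIHHHHHH", data, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    for i in range(e_phnum):
        # ELF64 program header: p_type, p_flags, p_offset, p_vaddr,
        # p_paddr, p_filesz, p_memsz, p_align
        ph = struct.unpack_from("<IIQQQQQQ", data, e_phoff + i * e_phentsize)
        p_type, p_offset, p_filesz = ph[0], ph[2], ph[5]
        if p_type == PT_INTERP:
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None  # no PT_INTERP entry, e.g. a statically linked executable
```

Passing the bytes of a dynamically linked binary (for example, the contents of a file opened in "rb" mode) should return the path of its dynamic linker, while a statically linked binary should give None.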
Binary Programs in Linux
A binary file can be an executable as well as a non-executable file. Examples of non-executable binary files are rich-text documents, audio and video files, compressed files, graphics files, spreadsheet files, and so on. A binary program is a binary file that is executable.
A binary executable file is a file in a machine language for a specific processor. Binary executable files contain executable code that is represented in specific processor instructions. These instructions are executed by a processor directly. A binary file, however, can have text strings (ASCII and/or Unicode).
Best moments of the exec system call on Linux 4.0
The best way to find all of that out is to GDB step debug the kernel with QEMU: How to debug the Linux kernel with GDB and QEMU?
fs/exec.c defines the system call at SYSCALL_DEFINE3(execve. It simply forwards to do_execve.
do_execve
Forwards to do_execveat_common.
do_execveat_common
To find the next major function, track when the return value retval is last modified.
Starts building a struct linux_binprm *bprm to describe the program, and passes it to exec_binprm to execute.
exec_binprm
Once again, follow the return value to find the next major call.
search_binary_handler
Handlers are determined by the first magic bytes of the executable.
The two most common handlers are those for interpreted files (#! magic) and for ELF (\x7fELF magic), but there are others built into the kernel, e.g. a.out. Users can also register their own through /proc/sys/fs/binfmt_misc.
The ELF handler is defined at fs/binfmt_elf.c.
See also: Why do people write the #!/usr/bin/env python shebang on the first line of a Python script?
The formats list contains all the handlers.
Each handler file contains something like:
static int __init init_elf_binfmt(void)
{
    register_binfmt(&elf_format);
    return 0;
}
and elf_format is a struct linux_binfmt defined in that file.
__init is magic and puts that code into a magic section that gets called when the kernel starts: What does __init mean in the Linux kernel code?
Linker-level dependency injection!
There is also a recursion counter, in case an interpreter executes itself infinitely.
Try this:
echo '#!/tmp/a' > /tmp/a
chmod +x /tmp/a
/tmp/a
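The idea behind the recursion counter can be sketched in user space. The Python below is a toy model, not the kernel's implementation: it follows #! interpreters through an in-memory dictionary standing in for the filesystem, and gives up after MAX_DEPTH levels. The names parse_shebang, resolve, and MAX_DEPTH, and the value 5, are assumptions made for this illustration.

```python
MAX_DEPTH = 5  # assumed limit for this sketch; check the kernel source for the real one

def parse_shebang(head: bytes):
    """Return the interpreter path from a '#!' first line, or None if absent."""
    if not head.startswith(b"#!"):
        return None
    line = head[2:].split(b"\n", 1)[0].strip()
    if not line:
        return None
    # The first whitespace-separated word is the interpreter path.
    return line.split()[0].decode()

def resolve(path: str, files: dict, depth: int = 0) -> str:
    """Follow '#!' interpreters until a non-script file is reached.

    `files` maps a path to the first bytes of that file (a stand-in for
    opening the file on disk).
    """
    if depth > MAX_DEPTH:
        raise RecursionError(f"too many levels of interpreters at {path}")
    interp = parse_shebang(files[path])
    if interp is None:
        return path  # not a script: this is what would finally be loaded
    return resolve(interp, files, depth + 1)
```

With files = {"/tmp/a": b"#!/tmp/a\n"}, calling resolve("/tmp/a", files) raises RecursionError instead of looping forever, which mirrors the self-referencing script above.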
Once again we follow the return value to see what comes next, and see that it comes from:
retval = fmt->load_binary(bprm);
where load_binary is defined for each handler on the struct: C-style polymorphism.
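The handler registration and dispatch described above can be modelled in a few lines. This is a toy Python sketch, not the kernel code: the names echo the kernel's (register_binfmt, formats, load_binary, search_binary_handler), but the class LinuxBinfmt, the two loader functions, and their signatures are invented for the example. Each loader returns -ENOEXEC when it does not recognize the magic bytes, and the dispatcher moves on to the next handler.

```python
import errno

class LinuxBinfmt:
    """Toy stand-in for the kernel's struct linux_binfmt."""
    def __init__(self, name, load_binary):
        self.name = name
        # Per-format callable, playing the role of a C function pointer.
        self.load_binary = load_binary

formats = []  # toy version of the list holding all registered handlers

def register_binfmt(fmt):
    formats.append(fmt)

def load_script(head: bytes) -> int:
    # Accept only files that start with the '#!' magic.
    return 0 if head.startswith(b"#!") else -errno.ENOEXEC

def load_elf(head: bytes) -> int:
    # Accept only files that start with the ELF magic.
    return 0 if head.startswith(b"\x7fELF") else -errno.ENOEXEC

register_binfmt(LinuxBinfmt("script", load_script))
register_binfmt(LinuxBinfmt("elf", load_elf))

def search_binary_handler(head: bytes):
    """Return the name of the first handler that accepts `head`, or None."""
    for fmt in formats:
        retval = fmt.load_binary(head)  # same call site, different code per format
        if retval != -errno.ENOEXEC:
            return fmt.name
    return None
```

The single call site fmt.load_binary(head) running different code per handler is the polymorphism in question; in C the same effect comes from a function pointer stored in the struct.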
fs/binfmt_elf.c:load_binary
Does the actual work: it sets up the initial register state in a struct pt_regs and calls start_thread, which marks the process as available to be scheduled by the scheduler.
Eventually the scheduler decides to run the process, and it must then jump to the PC address stored in struct pt_regs while also moving to a less privileged CPU state such as Ring 3 / EL0: What are Ring 0 and Ring 3 in the context of operating systems?
The scheduler is woken up periodically by clock hardware that generates interrupts at a rate configured earlier by the kernel, for example the old x86 PIT or the ARM timer. The kernel also registers handlers that run the scheduler code when the timer interrupts fire.
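TODO: continue source analysis further. What I expect to happen next:
The kernel finds the dynamic loader requested by the ELF (for example /lib64/ld-linux-x86-64.so.2).
If one is present, the loader runs in userland, works out which shared libraries the ELF needs, and does dlopen on them.
dlopen uses a configurable search path to find those libraries (ldd and friends), mmaps them to memory, and somehow informs the ELF where to find its missing symbols.
The loader then jumps to the _start of the ELF.
Otherwise, the kernel loads the executable into memory directly without the dynamic loader.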
In particular, it must check whether the executable is PIE and, if it is, place it in memory at a random location: What is the -fPIE option for position-independent executables in gcc and ld?
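One way to see the PIE distinction from user space is to look at the e_type field of the ELF header: position-independent executables are recorded as ET_DYN, while traditional fixed-address executables are ET_EXEC. The Python sketch below reads that field; the function name is_position_independent is made up for this example, and note that shared libraries are ET_DYN as well, so this check alone does not tell an executable from a library.

```python
import struct

ET_EXEC = 2  # executable linked to run at a fixed address
ET_DYN = 3   # shared object; PIE executables use this type too

def is_position_independent(data: bytes) -> bool:
    """Return True if the ELF header in `data` has e_type == ET_DYN."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] (EI_DATA) gives the byte order: 1 = little, 2 = big endian.
    endian = "<" if data[5] == 1 else ">"
    # e_type is a 16-bit field located right after the 16-byte e_ident.
    (e_type,) = struct.unpack_from(endian + "H", data, 16)
    return e_type == ET_DYN
```

Reading the first 18 bytes of a binary and passing them to this function is enough, since only e_ident and e_type are inspected.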
Two system calls from the Linux kernel are relevant. The fork system call (or perhaps vfork or clone) is used to create a new process, similar to the calling one (every Linux user-land process except init is created by fork or friends). The execve system call replaces the process address space with a fresh one (essentially by sort-of mmap-ing segments from the ELF executable and anonymous segments, then initializing the registers, including the stack pointer). The x86-64 ABI supplement and the Linux assembly howto give details.
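The fork-then-execve pattern looks like this from user space. The sketch uses Python's thin wrappers os.fork, os.execve, and os.waitpid around the system calls; the helper name spawn_and_wait and the fallback exit code 127 are choices made for this example.

```python
import os
import sys

def spawn_and_wait(path, argv, env=None):
    """Fork a child, replace its image with `path` via execve, return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: execve only returns (by raising OSError) if it fails.
        try:
            os.execve(path, argv, dict(os.environ) if env is None else env)
        finally:
            os._exit(127)  # exit without running the parent's cleanup handlers
    # Parent: wait for the child and decode its wait status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # negative signal number if killed by a signal

if __name__ == "__main__":
    code = spawn_and_wait(sys.executable,
                          [sys.executable, "-c", "print('hello from the new image')"])
    print("child exit code:", code)
```

After a successful execve the child keeps its PID but runs a completely different program, which is why the parent can still wait on the PID returned by fork.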
The dynamic linking happens after execve and involves the /lib/x86_64-linux-gnu/ld-2.13.so file, which for ELF is viewed as an "interpreter".