For the following snippet of code,
int n;
char buf[100];
int fd = open ("/etc/passwd", O_RDONLY);
n = read ( fd, buf, 100);
How does the compiler come to know that read is a system call and not a library function?
How does it retrieve the system call number (__NR_read)?
I very much doubt that the compiler knows it's a system call. It's far more likely that open is in a library somewhere and the code within the library calls the relevant kernel interface.
The assembly output from the simple program:
#include <stdio.h>
int main (void) {
    int fd = open("xyz");
    return 0;
}
is (irrelevant bits removed):
main:
pushl %ebp ; stack frame setup.
movl %esp, %ebp
andl $-16, %esp
subl $32, %esp
movl $.LC0, (%esp) ; Store file name address.
call open ; call the library function.
movl %eax, 28(%esp) ; save returned file descriptor.
movl $0, %eax ; return 0 error code.
leave ; stack frame teardown.
ret
.LC0:
.string "xyz" ; file name to open.
The first thing you'll notice is that there's a call to open. In other words, it's a function. There's not an int 0x80 or sysenter in sight, which are the mechanisms used for proper system calls (on my platform anyway - YMMV).
The wrapper functions in libc are where the actual work of accessing the system call interface is done.
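As a rough sketch of what such a wrapper boils down to, the snippet below skips the read wrapper and asks the kernel directly through the generic syscall() function. It assumes Linux with glibc. raw_read is a hypothetical name used only for illustration, not something libc provides.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_read */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper (not part of libc): issue the read system call
 * by number instead of calling the read() wrapper. Like the wrapper,
 * syscall() returns -1 and sets errno on failure. */
long raw_read(int fd, void *buf, size_t count)
{
    return syscall(SYS_read, fd, buf, count);
}
```

Calling raw_read(fd, buf, 100) should behave like read(fd, buf, 100) from the question. The only difference is that the system call number is passed explicitly here, which is the part the wrapper normally hides.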
An excerpt from Wikipedia on system calls:
Generally, systems provide a library that sits between normal programs and the operating system, usually an implementation of the C library (libc), such as glibc. This library exists between the OS and the application, and increases portability.
On exokernel based systems, the library is especially important as an intermediary. On exokernels, libraries shield user applications from the very low level kernel API, and provide abstractions and resource management.
The terms "system call" and "syscall" are often incorrectly used to refer to C standard library functions, particularly those that act as a wrapper to corresponding system calls with the same name. The call to the library function itself does not cause a switch to kernel mode (if the execution was not already in kernel mode) and is usually a normal subroutine call (i.e., using a "CALL" assembly instruction in some ISAs). The actual system call does transfer control to the kernel (and is more implementation-dependent than the library call abstracting it). For example, fork and execve are GLIBC functions that in turn call the fork and execve system-calls.
And, after a bit of searching, the __open function is found in glibc 2.9 in the io/open.c file, and weakref'ed to open. If you execute:
nm /usr/lib/libc.a | egrep 'W __open$|W open$'
you can see them in there:
00000000 W __open
00000000 W open
read is a library call as far as the compiler is concerned. It just so happens that the libc implementation defines read to generate a software interrupt with the correct number.
The compiler can see the declaration of this function in <unistd.h>, and it generates object code that makes a call to that function.
Try compiling with gcc -S and you'll see something like:
movl $100, %edx
movq %rcx, %rsi
movl %eax, %edi
call read
The system call is made from the C library's implementation of read(2).
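One way to convince yourself that the compiler treats read as an ordinary function is to take its address, as in this small sketch. It assumes a POSIX system; read_fn and read_via_pointer are hypothetical names made up for the example.

```c
#include <unistd.h>   /* declares read() like any other function */

/* Pointer type matching the prototype of read() from <unistd.h>. */
typedef ssize_t (*read_fn)(int, void *, size_t);

/* Hypothetical helper: call read() through a plain function pointer.
 * This compiles because read is just an external function symbol
 * that the linker resolves against libc. */
ssize_t read_via_pointer(int fd, void *buf, size_t count)
{
    read_fn fn = read;
    return fn(fd, buf, count);
}
```

Nothing here is specific to system calls. The same code would work with any libc function of the same signature, which is the point.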
EDIT: specifically, GNU libc (which is likely what you have on Linux) establishes the relationships between syscall numbers and function names in glibc-2.12.1/sysdeps/syscalls.list. Each line of that file is converted to an assembly-language source file (based on sysdeps/unix/syscall-template.S), compiled, and added to the library when libc is built.
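The number itself is just a preprocessor constant that user code can also see. As a minimal sketch, assuming Linux with glibc, including <sys/syscall.h> makes both __NR_read (from the kernel headers) and its SYS_read alias available. read_syscall_number is a hypothetical helper name.

```c
#include <sys/syscall.h>   /* pulls in the kernel's __NR_* constants */

/* SYS_read is defined in terms of __NR_read, so this check holds
 * at compile time on every architecture. */
_Static_assert(SYS_read == __NR_read, "SYS_read aliases __NR_read");

/* Hypothetical helper: return the number a read stub would pass to
 * the kernel. The value differs between architectures, so nothing
 * should hard-code it. */
long read_syscall_number(void)
{
    return __NR_read;
}
```

The compiler never looks this number up when it sees a call to read. Only the code inside libc (or code like the above that asks for it explicitly) uses it.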