My understanding of system calls is that in Linux the system call mechanism (int 0x80 or whatever) is documented and guaranteed to be stable across different kernel versions. Using this information, the system calls are implemented directly in the CRT library, so that when I call e.g. printf("a"); this involves a single function call to the CRT, where the system call is set up and activated. In theory this can be improved further by statically compiling the CRT (not common on Linux, but a possibility) so that even the single function call may be inlined.
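To make the Linux side of the question concrete, here is a minimal sketch of the two user-space routes to the same kernel service, assuming Linux with glibc or a compatible libc. The function names write_via_wrapper and write_via_syscall_number are made up for illustration. Note that syscall() is itself a libc function, so even the second route still goes through one ordinary function call before the kernel is entered.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE       /* exposes syscall() in strict -std=c11 builds */
#endif
#include <stddef.h>
#include <sys/syscall.h>  /* SYS_write */
#include <unistd.h>       /* write(), syscall(), pipe() */

/* Route 1: the usual one, through the libc wrapper for write(2). */
static long write_via_wrapper(int fd, const void *buf, size_t len) {
    return (long)write(fd, buf, len);
}

/* Route 2: name the system call by number and let libc's generic
 * syscall() helper load the registers and enter the kernel.
 * On failure it returns -1 and sets errno, like the wrapper does. */
static long write_via_syscall_number(int fd, const void *buf, size_t len) {
    return syscall(SYS_write, fd, buf, len);
}
```

Both functions should return the number of bytes written for a valid descriptor. printf adds buffering and formatting on top of this, so a single printf call does not necessarily map to exactly one write.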
On the other hand, Windows does not document or even guarantee consistency of the system call mechanism. The only way to make a system call on Windows is to call into ntdll.dll (or maybe some other *.dll), which is done from the CRT, so there are two function calls involved. If the CRT is used statically and the function gets inlined (slightly more common on Windows than Linux), we still have the single function call into ntdll.dll that we can't get rid of.
So it seems to me that theoretically system calls on Windows will be inherently slower since they always have to do one function call more than their Linux equivalents. Is this understanding (and my explanation above) true?
Note: I am asking this purely theoretically. I understand that when doing a system call (which I think always involves 2 context switches - one in each direction) the cost of an extra function call is probably completely negligible.
Slow system calls are those that wait for an indefinite stretch of time for something to finish (e.g. waitpid), for something to become available (e.g. read from a client socket that's not seen any data recently), or for some external event (e.g. network connection request from client via accept.)
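As a rough illustration of a "slow" call, the sketch below uses poll(2) with a timeout to ask whether a read would block, assuming a POSIX system. The helper names wait_readable and demo_slow_read are hypothetical, and a pipe stands in for the client socket mentioned above.

```c
#include <poll.h>
#include <unistd.h>  /* pipe(), write(), close() */

/* Returns 1 if fd is ready (a read would not block), 0 if timeout_ms
 * elapsed with nothing to read, and -1 if poll() itself failed. */
static int wait_readable(int fd, int timeout_ms) {
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int n = poll(&pfd, 1, timeout_ms);
    if (n < 0) return -1;
    return n > 0 ? 1 : 0;
}

/* Demonstrates both outcomes on a pipe. Returns 0 if the pipe behaved
 * as expected: not readable while empty, readable after one write. */
static int demo_slow_read(void) {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    int before = wait_readable(fds[0], 50);  /* empty pipe: times out */
    int wrote = (int)write(fds[1], "x", 1);
    int after = wait_readable(fds[0], 50);   /* one byte is now pending */
    close(fds[0]);
    close(fds[1]);
    return (before == 0 && wrote == 1 && after == 1) ? 0 : -1;
}
```

A plain read() on the empty pipe would have waited indefinitely instead of returning after 50 ms, which is exactly what makes it a slow system call.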
Second, Linux provides several hundred system calls. Some of them are written in C, but some have to be implemented in assembly language. System calls are not limited to programs written in C; programs in most other languages, such as Ruby or Go, can make them too.
System calls are usually implemented in C (or at least that used to be the case) mixed with assembly language. That said, everything gets translated eventually to machine code.
On IA-32 there are two ways to make a system call: the int/iret instruction pair and the sysenter/sysexit instruction pair.
A pure int/iret based system call takes 211 CPU cycles (and much more on modern processors). Sysenter/sysexit takes 46 CPU cycles. As you can see, executing just the pair of instructions used for the system call already introduces significant overhead. But any system call implementation also involves some work on the kernel side (setting up the kernel context, dispatching the call and its arguments, etc.). A more or less realistic, highly optimized system call will take ~250 and ~100 CPU cycles for int/iret and sysenter/sysexit based system calls respectively. In Linux and Windows it will take ~500 cycles.
At the same time, a function call (based on call/ret) costs 2-4 cycles, plus 1 for each argument.
As you can see, the overhead introduced by a function call is negligible in comparison to the system call cost.
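If you want to see the gap on your own machine, a rough microbenchmark along these lines should do, assuming Linux and GCC or Clang. All names here are made up for the sketch. It reports nanoseconds per call rather than CPU cycles, and the absolute numbers will vary with the CPU, the kernel version and any sandboxing, so treat them as indicative only.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE       /* exposes syscall() in strict -std=c11 builds */
#endif
#include <sys/syscall.h>  /* SYS_getpid */
#include <time.h>         /* clock_gettime() */
#include <unistd.h>       /* syscall() */

/* Volatile sink so the compiler cannot drop the work in either loop. */
static volatile long sink;

/* noinline (a GCC/Clang attribute) forces a real call/ret pair. */
__attribute__((noinline)) static void plain_function(long x) { sink = x; }

static double now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e9 + (double)ts.tv_nsec;
}

/* Average wall-clock cost of one ordinary function call. */
static double ns_per_plain_call(long iterations) {
    double start = now_ns();
    for (long i = 0; i < iterations; i++) plain_function(i);
    return (now_ns() - start) / (double)iterations;
}

/* Average cost of one cheap system call. getpid is requested by
 * number so that a real kernel entry happens on every iteration. */
static double ns_per_syscall(long iterations) {
    double start = now_ns();
    for (long i = 0; i < iterations; i++) sink = syscall(SYS_getpid);
    return (now_ns() - start) / (double)iterations;
}
```

Calling ns_per_plain_call(1000000) and ns_per_syscall(100000) and printing both values should show the system call costing far more than the plain call, which is the point made above.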
On the other hand, if you embed raw system calls in your application, you will make it highly hardware dependent. For example, what if your application with an embedded sysenter/sysexit based raw system call is executed on an old PC that does not support these instructions? In addition, your application will be sensitive to the system call calling convention used by the OS.
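To give an idea of the kind of hardware check that raw system calls would push into your own code, here is a sketch that reads the SEP feature flag (CPUID leaf 1, EDX bit 11), which is how x86 processors advertise sysenter/sysexit. It assumes GCC or Clang, whose cpuid.h provides __get_cpuid; the function name cpu_reports_sysenter is hypothetical.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>  /* __get_cpuid(), shipped with GCC and Clang */
#endif

/* Returns 1 if the CPU advertises sysenter/sysexit through the SEP
 * flag, 0 otherwise (including on non-x86 builds). This only reads
 * the feature bit; a real dispatcher would need further checks and
 * would also have to know the OS-specific entry convention. */
static int cpu_reports_sysenter(void) {
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return 0;
    return (int)((edx >> 11) & 1u);  /* CPUID.01H:EDX[11] = SEP */
#else
    return 0;
#endif
}
```

This is only the detection half of the problem. Picking the right entry sequence at run time is work the system library already does for you.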
Libraries such as ntdll.dll and glibc are commonly used because they provide a well-known, hardware-independent interface to the system services and hide the details of communicating with the kernel behind the scenes.
Linux and Windows have approximately the same system call cost if they use the same way of crossing the user/kernel boundary (the difference will be negligible). Both try to use the fastest way possible on each particular machine. All modern Windows versions, starting at least from Windows XP, are prepared for sysenter/sysexit. Some old and/or specific versions of Linux may still use int/iret based calls. x64 versions of these OSes rely on the syscall/sysret instructions, which work like sysenter/sysexit and are available as part of the AMD64 instruction set.
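For the x64 case, this is roughly what a raw syscall-instruction call looks like on Linux, written as GCC/Clang inline assembly. It is a sketch under the Linux x86-64 convention (call number in rax, arguments in rdi, rsi, rdx, with rcx and r11 clobbered by the instruction); the function name raw_write is made up, and other builds fall back to libc's syscall() helper. Nothing like this is supported on Windows, where the call numbers are not a stable interface.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE       /* exposes syscall() in strict -std=c11 builds */
#endif
#include <stddef.h>
#include <sys/syscall.h>  /* SYS_write */
#include <unistd.h>       /* syscall(), pipe() */

/* Writes len bytes from buf to fd.
 * x86-64 Linux path: returns the byte count, or a negative errno
 * value on failure, because no libc wrapper translates the result.
 * Fallback path: returns -1 and sets errno, as syscall() does. */
static long raw_write(int fd, const void *buf, size_t len) {
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)                 /* result comes back in rax */
        : "a"((long)SYS_write),     /* system call number in rax */
          "D"((long)fd),            /* 1st argument in rdi */
          "S"(buf),                 /* 2nd argument in rsi */
          "d"(len)                  /* 3rd argument in rdx */
        : "rcx", "r11", "memory");  /* registers the instruction destroys */
    return ret;
#else
    return syscall(SYS_write, fd, buf, len);
#endif
}
```

The code is tied to one architecture, one OS and one compiler family, which is the hardware and OS dependence the answer warns about.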