
Performance difference between system call vs function call

I quite often hear driver developers say that it's good to avoid kernel mode switches as much as possible, but I couldn't understand the precise reason. To start with, my understanding is:

  1. System calls are software interrupts. On x86 they are triggered by the sysenter instruction, which looks like a branch instruction that takes its target from a machine-specific register.
  2. System calls don't really have to change the address space or process context.
  3. They do, however, save registers on the process stack and change the stack pointer to the kernel stack.

Given these operations, a syscall works pretty much like a normal function call. The sysenter could behave like a mispredicted branch, which could lead to a ROB flush in the processor pipeline. Even that is not really bad; it's just like any other mispredicted branch.

I heard a few people answering on Stack Overflow:

  1. You never know how long a syscall takes. - [me] Yes, but that's the case with any function. The amount of time it takes depends on the function.
  2. It is often a scheduling spot. - [me] A process can get rescheduled even if it runs in user mode all the time. For example, while(1); doesn't guarantee that no context switch happens.

Where is the actual syscall cost coming from?

APKar asked Jun 23 '12 13:06




2 Answers

You don't indicate what OS you are asking about. Let me attempt an answer anyway.

The CPU instructions syscall and sysenter should not be confused with the concept of a system call and its representation in the respective OSs.

The best explanation of the difference in overhead between the instructions comes from reading the Operation sections of the Intel® 64 and IA-32 Architectures Software Developer's Manual, volume 2A (for int, see page 3-392) and volume 2B (for sysenter, see page 4-463). Don't forget to glance at iretd and sysexit while you're at it.

A casual counting of the pseudo-code for the operations yields:

  • 408 lines for int
  • 55 lines for sysenter

Note: Although the existing answer is right that sysenter and syscall are not interrupts and are not related to interrupts in any way, older kernels in both the Linux and the Windows world used interrupts to implement their system call mechanism: int 0x80 on Linux and int 0x2E on Windows. Consequently, on those kernel versions the IDT had to be primed with an interrupt handler for the respective interrupt. On newer systems, it is true, the sysenter and syscall instructions have completely replaced the old ways. With sysenter it is MSR (machine-specific register) 0x176 that gets primed with the address of the sysenter handler (see the further reading below).


On Windows ...

A system call on Windows, just like on Linux, results in a switch to kernel mode. The NT scheduler doesn't give any guarantees about how much time a thread is granted. It also takes time away from threads and can even end up starving them. In general, user mode code can be preempted by kernel mode code (with a few very specific exceptions, which you'll certainly get to in the "advanced driver writing class").

This makes perfect sense if we look at just one example. User mode code can be swapped out, and so can the data it is trying to access. The CPU doesn't have the slightest clue how to access pages in the swap/paging file, so an intermediate step is required. That is also why kernel mode code must be able to preempt user mode code.

It is also the reason for one of the most prolific bug-check codes seen on Windows, mostly caused by third-party drivers: IRQL_NOT_LESS_OR_EQUAL. It means that a driver accessed paged memory at a point where the code touching that memory could not be preempted.


Further reading

  1. SYSENTER and SYSEXIT in Windows by Geoff Chappell (always worth a read in my experience!)
  2. Sysenter Based System Call Mechanism in Linux 2.6
  3. Windows NT platform specific discussion: How Do Windows NT System Calls REALLY Work?
  4. Windows NT platform specific discussion: System Call Optimization with the SYSENTER Instruction
  5. Windows Internals, 5th ed., by Russinovich et al. - pages 125 through 132.
  6. ReactOS implementation of KiFastSystemCall
0xC0000022L answered Oct 25 '22 19:10


SYSENTER/SYSCALL is not a software interrupt; the whole point of those instructions is to avoid the overhead of raising an interrupt and dispatching to an interrupt handler.

Saving registers on the stack costs time; this is one place where the syscall cost comes from.

Another part of the cost comes from the kernel mode switch itself. It involves changing the segment registers - CS, DS, ES, FS, GS all have to be changed (it's less costly on x86-64, where segmentation is mostly unused, but you still essentially need to make a far jump to kernel code) - and it also changes the CPU's ring of execution.

To conclude: a function call is (on modern systems, where segmentation is not used) a near call, while a syscall involves a far call and a ring switch.

Griwes answered Oct 25 '22 20:10