
How is the system call in Linux implemented?

When I invoke a system call in user mode, how does the OS process the call?

Does it invoke some executable binary or some standard library?

If so, what does it need to complete the call?

MainID asked Jan 31 '09 17:01


2 Answers

Have a look at this.

Starting with version 2.5, the Linux kernel introduced a new system call entry mechanism on Pentium II+ processors. Because the existing software interrupt method performed poorly on Pentium IV processors, an alternative system call entry mechanism was implemented using the SYSENTER/SYSEXIT instructions available on Pentium II+ processors. This article explores this new mechanism. Discussion is limited to the x86 architecture, and all source code listings are based on Linux kernel 2.6.15.6.

  1. What are system calls?

    System calls provide userland processes a way to request services from the kernel. What kind of services? Services managed by the operating system, such as storage, memory, network, and process management. For example, if a user process wants to read a file, it has to make the 'open' and 'read' system calls. Generally, processes do not invoke system calls directly; the C library provides wrapper functions for almost all of them.

  2. What happens in a system call?

    A piece of kernel code runs at the request of a user process. This code runs in ring 0 (current privilege level, CPL, 0), the highest privilege level on the x86 architecture. All user processes run in ring 3 (CPL 3).

    So, to implement a system call mechanism, we need:

    1) a way to call ring 0 code from ring 3.

    2) some kernel code to service the request.

  3. Good old way of doing it

    Before this change, Linux implemented system calls on all x86 platforms using software interrupts. To execute a system call, the user process copies the desired system call number into %eax and executes 'int 0x80'. This generates interrupt 0x80, and an interrupt service routine is called. For interrupt 0x80, this routine is the common entry point for all system calls, and it executes in ring 0. As defined in the file /usr/src/linux/arch/i386/kernel/entry.S, it saves the current state and calls the appropriate system call handler based on the value in %eax.

  4. New shiny way of doing it

    The software interrupt method turned out to be much slower on Pentium IV processors. To solve this issue, Linus implemented an alternative system call mechanism that takes advantage of the SYSENTER/SYSEXIT instructions provided by all Pentium II+ processors. Before going further with this new way of doing it, let's make ourselves more familiar with these instructions.

GregD answered Oct 08 '22 04:10


It depends on what you mean by system call. Do you mean a C library call (through glibc) or an actual system call? C library calls that need a service from the OS end up making system calls.

The old way of doing system calls was through a software interrupt, i.e., the int instruction. Windows had int 0x2e while Linux had int 0x80. The OS sets up an interrupt handler for 0x2e or 0x80 in the Interrupt Descriptor Table (IDT). This handler then performs the system call. It copies the arguments from user mode to kernel mode (how this is done is an OS-specific convention). On Linux, the system call number goes in eax and the arguments are passed in ebx, ecx, edx, esi, and edi (with ebp for a sixth argument). On Windows, the arguments are copied from the stack. The handler then performs a lookup to find the address of the function that implements the call and executes it. After the system call is completed, the iret instruction returns to user mode.

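The new way is the sysenter and sysexit instructions. They switch between user mode and kernel mode without going through the IDT: the OS programs the kernel entry point and kernel stack into Model Specific Registers (MSRs) ahead of time, and sysenter loads the code segment, instruction pointer, and stack pointer from them. After that, handling the call is practically the same as with int.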

wj32 answered Oct 08 '22 05:10