Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Linux System Calls & Kernel Mode

I understand that system calls exist to provide access to capabilities that are disallowed in user space, such as accessing a HDD using the read() system call. I also understand that these are abstracted by a user-mode layer in the form of library calls such as fread(), to provide compatibility across hardware.

So from the application developers point of view, we have something like;

//library    //syscall   //k_driver    //device_driver
fread()  ->  read()  ->  k_read()  ->  d_read()

My question is; what is stopping me inlining all the instructions in the fread() and read() functions directly into my program? The instructions are the same, so the CPU should behave in the same way? I have not tried it, but I assume that this does not work for some reason I am missing. Otherwise any application could get arbitrary kernel mode operation.

TL;DR: What allows system calls to 'enter' kernel mode that is not copy-able by an application?

like image 289
lynks Avatar asked May 24 '13 10:05

lynks


1 Answers

System calls do not enter the kernel themselves. More precisely, for example the read function you call is still, as far as your application is concerned, a library call. What read(2) does internally is calling the actual system call using some interruption or the syscall(2) assembly instruction, depending on the CPU architecture and OS.

This is the only way for userland code to have privileged code to be executed, but it is an indirect way. The userland and kernel code execute in different contexts.

That means you cannot add the kernel source code to your userland code and expect it to do anything useful but crash. In particular, the kernel code has access to physical memory addresses required to interact with the hardware. Userland code is limited to access a virtual memory space that has not this capability. Also, the instructions userland code is allowed to execute is a subset of the ones the CPU support. Several I/O, interruption and virtualization related instructions are examples of prohibited code. They are known as privileged instructions and require to be in an lower ring or supervisor mode depending on the CPU architecture.

like image 56
jlliagre Avatar answered Sep 22 '22 13:09

jlliagre