First consider the situation when there is only one operating system installed. Now I run some executable. Processor reads instructions from the executable file and preforms these instructions. Even though I can put whatever instructions I want into the file, my program can't read arbitrary areas of HDD (and do many other potentially "bad" things).
It looks like magic, but I understand how this magic works. Operating system starts my program and puts the processor into some "unprivileged" state. "Unsafe" processor instructions are not allowed in this state and the only way to put the processor back to "privileged" state is give the control back to kernel. Kernel code can use all the processor's instructions, so it can do potentially unsafe things my program "asked" for if it decides it is allowed.
Now suppose we have VMWare or VirtualBox on Windows host. Guest operating system is Linux. I run a program in guest, it transfers control to guest Linux kernel. The guest Linux kernel's code is supposed to be run in processor's "privileged" mode (it must contain the "unsafe" processor instructions!). But I strongly doubt that it has an unlimited access to all the computer's resources.
I do not need too much technical details, I only want to understand how this part of magic works.
This is a great question and it really hits on some cool details regarding security and virtualization. I'll give a high-level overview of how things work on an Intel processor.
How are normal processes managed by the operating system?
An Intel processor has 4 different "protection rings" that it can be in at any time. The ring that code is currently running in determines the specific assembly instructions that may run. Ring 0 can run all of the privileged instructions whereas ring 3 cannot run any privileged instructions.
The operating system kernel always runs in ring 0. This ring allows the kernel to execute the privileged instructions it needs in order to control memory, start programs, write to the HDD, etc.
User applications run in ring 3. This ring does not permit privileged instructions (e.g. those for writing to the HDD) to run. If an application attempts to run a privileged instruction, the processor will take control from the process and raise an exception that the kernel will handle in ring 0; the kernel will likely just terminate the process.
Rings 1 and 2 tend not to be used, though they have a use.
Further reading
How does virtualization work?
Before there was hardware support for virtualization, a virtual machine monitor (such as VMWare) would need to do something called binary translation (see this paper). At a high level, this consists of the VMM inspecting the binary of the guest operating system and emulating the behavior of the privileged instructions in a safe manner.
Now there is hardware support for virtualization in Intel processors (look up Intel VT-x). In addition to the four rings mentioned above, the processor has two states, each of which contains four rings: VMX root mode and VMX non-root mode.
The host operating system and its applications, along with the VMM (such as VMWare), run in VMX root mode. The guest operating system and its applications run in VMX non-root mode. Again, both of these modes each have their own four rings, so the host OS runs in ring 0 of root mode, the host OS applications run in ring 3 of root mode, the guest OS runs in ring 0 of non-root mode, and the guest OS applications run in ring 3 of non-root mode.
When code that is running in ring 0 of non-root mode attempts to execute a privileged instruction, the processor will hand control back to the host operating system running in root mode so that the host OS can emulate the effects and prevent the guest from having direct access to privileged resources (or in some cases, the processor hardware can just emulate the effect itself without getting the host involved). Thus, the guest OS can "execute" privileged instructions without having unsafe access to hardware resources - the instructions are just intercepted and emulated. The guest cannot just do whatever it wants - only what the host and the hardware allow.
Just to clarify, code running in ring 3 of non-root mode will cause an exception to be sent to the guest OS if it attempts to execute a privileged instruction, just as an exception will be sent to the host OS if code running in ring 3 of root mode attempts to execute a privileged instruction.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With