Assume there is code like this (Intel assembly syntax):
MOV [0x1234], EAX
Let's say the CPU wants to execute this instruction, and let's assume there is no hypervisor: we are just using a normal x86 CPU (protected mode) in a Linux environment.
Now, what I understand is that since 0x1234 is a virtual address, it needs to be translated into a physical address (let's skip the segmentation part).
The CPU just passes this address (0x1234) to the MMU hardware. The MMU walks the page table and accesses the memory contents using the physical address.
Am I correct?
Now let's assume there is a hypervisor and this code is running in a guest OS.
What happens exactly?
I know that the hypervisor provides another layer of page tables, but I don't understand exactly how this works.
If the guest code "MOV [0x1234], EAX" is executed on the real CPU, the virtual address 0x1234 will be translated by the real hardware MMU. So I think this instruction has to be rewritten (0x1234 should be replaced with another address before the code is executed), or it has to trap to the hypervisor...
Am I wrong? If I am wrong, please fix my understanding...
Thank you in advance.
Whenever workloads access data in memory, the system needs to look up the physical memory address that matches the virtual address. This is what we refer to as memory translations or mappings. To map virtual memory addresses to physical memory addresses, page tables are used.
Address translation: when the system allocates a frame to a page, it translates the logical address into a physical address and creates an entry in the page table, which is used throughout the execution of the program. When a process is to be executed, its pages are loaded into whatever memory frames are available.
A system using virtual memory can also use a section of the hard drive to extend RAM. With virtual memory, a system can run larger programs, or several programs at the same time, letting each one operate as if it had more memory than is physically installed, without having to purchase more RAM.
The hypervisor virtualizes the guest physical memory to isolate virtual machines from each other and to provide a contiguous, zero-based memory space for each guest operating system, just as on non-virtualized systems.
Answering your first question: yes, you are correct. That's basically how virtual memory works.
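To make that concrete, here is a rough software model, in C, of what the MMU does on a classic 32-bit x86 two-level page walk (4 KiB pages, no PAE). It is only a sketch: read_phys() is a stand-in for a raw physical memory read, and the hardware of course does all of this in silicon, not in C.

#include <stdint.h>

#define PAGE_PRESENT 0x1u

/* Stand-in for a raw physical memory read done by the MMU. */
extern uint32_t read_phys(uint32_t paddr);

/* pd_base is the physical page directory base held in CR3. */
int translate(uint32_t pd_base, uint32_t vaddr, uint32_t *paddr_out)
{
    uint32_t pd_index = (vaddr >> 22) & 0x3FF;   /* top 10 bits    */
    uint32_t pt_index = (vaddr >> 12) & 0x3FF;   /* next 10 bits   */
    uint32_t offset   =  vaddr        & 0xFFF;   /* low 12 bits    */

    uint32_t pde = read_phys(pd_base + pd_index * 4);
    if (!(pde & PAGE_PRESENT))
        return -1;                               /* would raise #PF */

    uint32_t pt_base = pde & 0xFFFFF000;
    uint32_t pte = read_phys(pt_base + pt_index * 4);
    if (!(pte & PAGE_PRESENT))
        return -1;                               /* would raise #PF */

    *paddr_out = (pte & 0xFFFFF000) | offset;    /* final physical address */
    return 0;
}

For the instruction above, vaddr would be 0x1234, so pd_index is 0, pt_index is 1, and offset is 0x234.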
Now, let's see what happens when a hypervisor is running between the MMU and a guest OS. For performance's sake, a hypervisor (whether it's type 1 or type 2) tries to avoid trapping on every guest OS memory access. The idea is to let the guest OS manage the MMU as far as possible. I will elaborate with possible implementations, one for x86 and one for PowerPC.
On x86, from Intel's manual, Volume 3B:
27.3.2 Guest & Host Physical Address Spaces
Memory virtualization provides guest software with contiguous guest physical address space starting zero and extending to the maximum address supported by the guest virtual processor’s physical address width. The VMM utilizes guest physical to host physical address mapping to locate all or portions of the guest physical address space in host memory. The VMM is responsible for the policies and algorithms for this mapping which may take into account the host system physical memory map and the virtualized physical memory map exposed to a guest by the VMM.
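A minimal sketch of that guest-physical to host-physical mapping might look like the following; the mem_region structure, the example layout, and gpa_to_hpa() are purely illustrative, not taken from any real VMM.

#include <stdint.h>
#include <stddef.h>

/* One contiguous chunk of guest physical memory backed by host memory. */
struct mem_region {
    uint64_t gpa_start;   /* guest physical start address            */
    uint64_t hpa_start;   /* where the VMM placed it in host memory  */
    uint64_t size;        /* length of the region in bytes           */
};

/* The VMM decides this layout; the guest just sees a zero-based,
 * contiguous physical address space. Example: guest 0..512 MiB is
 * backed by host memory starting at 1 GiB. */
static const struct mem_region regions[] = {
    { 0x00000000, 0x40000000, 0x20000000 },
};

/* Translate a guest physical address to a host physical address. */
int gpa_to_hpa(uint64_t gpa, uint64_t *hpa)
{
    for (size_t i = 0; i < sizeof(regions) / sizeof(regions[0]); i++) {
        if (gpa >= regions[i].gpa_start &&
            gpa < regions[i].gpa_start + regions[i].size) {
            *hpa = regions[i].hpa_start + (gpa - regions[i].gpa_start);
            return 0;
        }
    }
    return -1;   /* not RAM: a hole or MMIO, handled elsewhere by the VMM */
}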
The VMM knows the current PDBR base address of the VM (the PDBR is held in the CR3 register), since an access to CR3 causes a VM exit. The VMM is therefore able to maintain the real page directory on behalf of the guest OS. I mean that when the guest OS modifies its page directory to map logical address A to physical address B, the VMM traps on this and, instead of mapping A to B, it maps A to C. Thus any further access to A won't cause a #PF; it will be transparently routed to C through the MMU. The sad part of this approach is that the guest believes it has mapped A to B, while A is actually mapped to C, so the VMM has to maintain a virtual page directory in case the guest later reads back where it previously mapped A. The VMM traps on this read access and, instead of reporting that A is mapped to C (which is what the real page directory contains), it tells the guest that A is mapped to B.
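Here is a very rough sketch, in C, of that "map A to C instead of B" bookkeeping. The handler names and the flat virtual_pt array are invented for illustration (gpa_to_hpa() is the kind of helper shown in the previous sketch); a real shadow-paging implementation has to cope with multiple page-table levels, permission bits, TLB invalidation, and so on.

#include <stdint.h>

/* Shadow paging, heavily simplified.  The guest writes a PTE mapping
 * guest-virtual A to guest-physical B.  The VMM intercepts the write,
 * translates B to host-physical C, and installs A -> C in the *real*
 * (shadow) page tables used by the MMU, while remembering B in a
 * "virtual" copy so later guest reads still see B. */

extern int  gpa_to_hpa(uint64_t gpa, uint64_t *hpa);
extern void shadow_map(uint64_t guest_vaddr, uint64_t hpa, uint32_t flags);

#define MAX_PTES 1024
/* Indexed by virtual page number, for simplicity: what the guest wrote. */
static uint64_t virtual_pt[MAX_PTES];

/* Called when a guest write to one of its page tables traps into the VMM. */
void on_guest_pte_write(uint64_t guest_vaddr, uint64_t guest_pte, uint32_t flags)
{
    uint64_t b = guest_pte & ~0xFFFULL;          /* guest-physical frame B  */
    uint64_t c;

    virtual_pt[(guest_vaddr >> 12) % MAX_PTES] = guest_pte;  /* remember B  */

    if (gpa_to_hpa(b, &c) == 0)
        shadow_map(guest_vaddr, c, flags);       /* MMU will route A to C   */
}

/* Called when a guest read of one of its page tables traps into the VMM:
 * return what the guest wrote (B), not what the MMU really uses (C). */
uint64_t on_guest_pte_read(uint64_t guest_vaddr)
{
    return virtual_pt[(guest_vaddr >> 12) % MAX_PTES];
}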
27.3.3 Virtualizing Virtual Memory by Brute Force
A simple-minded way to do this would be to ensure that all guest attempts to access address-translation hardware trap to the VMM where such operations can be properly emulated. It must ensure that accesses to page directories and page tables also get trapped. This may be done by protecting these in-memory structures with conventional page-based protection. The VMM can do this because it can locate the page directory because its base address is in CR3 and the VMM receives control on any change to CR3; it can locate the page tables because their base addresses are in the page directory.
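Sketched below is one way to obtain those traps, under the same assumptions as before: when building the shadow mappings, the VMM strips the write permission from any page that holds a guest page directory or page table, so guest writes to those structures fault into the VMM, which emulates them. All helper names here are hypothetical.

#include <stdint.h>

#define PTE_WRITABLE 0x2u

/* Hypothetical helpers; a real VMM tracks which guest frames currently
 * hold page directories / page tables (it can find them starting from CR3). */
extern int  is_guest_page_table_frame(uint64_t gpa);
extern void emulate_guest_pte_write(void *vcpu, uint64_t gpa);
extern void inject_page_fault(void *vcpu);

/* When the VMM builds a shadow mapping, strip the write bit from any
 * page that holds a guest page directory or page table, so every guest
 * write to those structures faults into the VMM. */
uint32_t shadow_flags(uint64_t gpa, uint32_t guest_flags)
{
    if (is_guest_page_table_frame(gpa))
        return guest_flags & ~PTE_WRITABLE;
    return guest_flags;
}

/* Page-fault VM exit: either emulate the trapped page-table update or
 * hand the fault back to the guest as a normal #PF. */
void on_pagefault_vmexit(void *vcpu, uint64_t fault_gpa, int is_write)
{
    if (is_write && is_guest_page_table_frame(fault_gpa))
        emulate_guest_pte_write(vcpu, fault_gpa);  /* also update the shadow tables */
    else
        inject_page_fault(vcpu);                   /* not ours: the guest's own #PF */
}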
On PowerPC, there is no hardware table walk of a page directory as on Intel. Every modification of the TLB is caused by an instruction, usually executed by the kernel's memory manager. Here again, a straightforward idea is to trap on each guest access to the TLB (for instance by setting things up so that a VM exit occurs when the guest executes a tlbwe instruction; note: tlbwe writes an entry into the TLB). Once inside the VMM, the hypervisor decodes the trapping instruction and emulates its behaviour, but instead of mapping A to B, it maps A to C, directly in the TLB. Again, the VMM has to maintain a virtual TLB in case the guest OS wants to check what is in the TLB, and it returns what the guest believes it put in the TLB earlier.
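The PowerPC side could be sketched along the same lines; the tlb_entry layout and the helpers below are invented for illustration (real TLB entry formats vary between cores), and tlbre is assumed here as the matching instruction that reads an entry back on cores that have it.

#include <stdint.h>

#define VTLB_ENTRIES 64

/* One software-visible TLB entry: the guest thinks it maps vaddr -> guest_pa. */
struct tlb_entry {
    uint64_t vaddr;
    uint64_t guest_pa;   /* "B": what the guest asked for            */
    uint64_t host_pa;    /* "C": what the VMM really put in the TLB  */
    int      valid;
};

static struct tlb_entry vtlb[VTLB_ENTRIES];   /* the VMM's virtual TLB */

extern int  gpa_to_hpa(uint64_t gpa, uint64_t *hpa);
extern void hw_tlb_write(unsigned idx, uint64_t vaddr, uint64_t host_pa);

/* The guest executed tlbwe; the VMM has decoded idx, vaddr, and guest_pa
 * from the trapping instruction's register operands. */
void emulate_tlbwe(unsigned idx, uint64_t vaddr, uint64_t guest_pa)
{
    uint64_t host_pa;

    if (gpa_to_hpa(guest_pa, &host_pa) != 0)
        return;   /* a real VMM would handle MMIO / invalid GPAs here */

    vtlb[idx % VTLB_ENTRIES] = (struct tlb_entry){ vaddr, guest_pa, host_pa, 1 };
    hw_tlb_write(idx, vaddr, host_pa);        /* hardware maps A -> C */
}

/* The guest reads a TLB entry back: report B, not C. */
uint64_t emulate_tlbre(unsigned idx)
{
    return vtlb[idx % VTLB_ENTRIES].guest_pa;
}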
To conclude, although some hardware features help in virtualizing the guest physical memory, it's generally up to the VMM to manage the effective guest-physical to host-physical memory mapping.