I am trying to understand how many tables are looked up when translating a virtual address to a physical address. The Intel manual seems to describe numerous schemes:
http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-system-programming-manual-325384.pdf
(section 4)
whereas Ulrich Drepper's paper says there are typically 4:
http://www.akkadia.org/drepper/cpumemory.pdf
(page 38)
and yet many of the diagrams I find online show only two.
Could somebody explain how many page tables are typically accessed on an Intel CPU, or whether it depends on OS configuration (huge memory pages, etc.)?
The most commonly used number of page tables on an x86-64 system is 4. This assumes a 64-bit OS using what Intel calls IA-32e paging mode with 4 KB pages. Fewer page tables are needed for some of the other modes (as described in the Intel documentation).
Figure 4-1 from the Intel 64 and IA-32 Architectures Software Developer’s Manual shows the possible configurations. The columns to focus on are the two address-width columns and the page-size column. You see so many different options because each combination changes how the page tables are used and how many are needed.
The most commonly used configuration on x86-64 systems today is the IA-32e paging mode with 4 KB pages and the details for how this works are shown in Figure 4-8.
The value in register CR3 points to the root of the paging structure. There is a 48-bit linear address (the program's virtual address) that needs to be translated to a physical address (of up to 52 bits). The page offset for a 4 KB page is 12 bits, which leaves 36 bits of the linear address to index into the page tables. The more bits used to index into a single table, the larger that table needs to be. What Intel has done is divide the page table into 4 levels, each accessed with 9 index bits.
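As a rough illustration of that split, here is a minimal Python sketch that cuts a 48-bit linear address into four 9-bit indices and a 12-bit offset (36 = 4 × 9). The function name is made up for this example, and the level names in the comments (PML4, PDPT, PD, PT) are the ones Intel's manual uses for the four tables; it only shows the bit arithmetic, not how the hardware actually walks the tables.

```python
def split_linear_address_4k(addr):
    """Split a 48-bit linear address for IA-32e paging with 4 KB pages.

    Returns (pml4_index, pdpt_index, pd_index, pt_index, page_offset).
    Hypothetical helper for illustration only.
    """
    addr &= (1 << 48) - 1          # keep only the 48 translated bits
    page_offset = addr & 0xFFF     # bits 0-11: offset inside the 4 KB page
    pt_index = (addr >> 12) & 0x1FF    # bits 12-20: index into the page table
    pd_index = (addr >> 21) & 0x1FF    # bits 21-29: index into the page directory
    pdpt_index = (addr >> 30) & 0x1FF  # bits 30-38: index into the PDPT
    pml4_index = (addr >> 39) & 0x1FF  # bits 39-47: index into the PML4 (root, from CR3)
    return pml4_index, pdpt_index, pd_index, pt_index, page_offset


# Build an address from known indices and split it again.
example = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
print(split_linear_address_4k(example))  # (1, 2, 3, 4, 5)
```

Each 9-bit index selects one of 512 entries, so one memory access per level gives the four table lookups mentioned above.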
If you are using 2 MB pages, then 21 bits are used as the offset into the page. That leaves 27 index bits, so one of the tables used in the translation can be removed while the other tables stay the same size (shown in Figure 4-9).
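The same arithmetic can be written down generally. The sketch below assumes a 48-bit linear address and 9 index bits per level, as described above; the function and constant names are invented for this example.

```python
LINEAR_ADDRESS_BITS = 48   # width of the linear address in IA-32e paging
INDEX_BITS_PER_LEVEL = 9   # each table is indexed with 9 bits


def levels_walked(page_size_bytes):
    """Return how many tables are indexed to translate an address
    for the given page size. Hypothetical helper for illustration only."""
    offset_bits = page_size_bytes.bit_length() - 1
    if page_size_bytes <= 0 or (1 << offset_bits) != page_size_bytes:
        raise ValueError("page size must be a power of two")
    index_bits = LINEAR_ADDRESS_BITS - offset_bits
    levels, remainder = divmod(index_bits, INDEX_BITS_PER_LEVEL)
    if levels < 1 or remainder != 0:
        raise ValueError("page size does not fit the 9-bits-per-level scheme")
    return levels


print(levels_walked(4 * 1024))         # 4 KB pages: 36 index bits -> 4
print(levels_walked(2 * 1024 * 1024))  # 2 MB pages: 27 index bits -> 3
```

A larger page simply moves 9 bits from the index part of the address into the offset part, which drops one level from the walk.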
The other configurations follow the same pattern and you can look in the Intel manual for more detail if necessary.
I suspect the diagrams you see online show only two levels because that is enough detail to explain the overall concepts used in paging. The additional levels are simply an extension of the same concepts, tuned for the particular address size and page size that the architecture wants to support.