Print kernel's page table entries

Tags: linux

Virtual memory map with 4 level page tables:

0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm
ffff800000000000 - ffff80ffffffffff (=40 bits) guard hole
ffff880000000000 - ffffc7ffffffffff (=64 TB) direct mapping of all phys. memory
ffffc80000000000 - ffffc8ffffffffff (=40 bits) hole
ffffc90000000000 - ffffe8ffffffffff (=45 bits) vmalloc/ioremap space
ffffe90000000000 - ffffe9ffffffffff (=40 bits) hole
ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
ffffffff80000000 - ffffffffa0000000 (=512 MB)  kernel text mapping, from phys 0
ffffffffa0000000 - fffffffffff00000 (=1536 MB) module mapping space
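
For readers following along, here is a minimal Python sketch of how a 48-bit x86-64 virtual address is split under 4-level paging: four 9-bit table indices plus a 12-bit page offset. The helper name split_va and the constant names are my own, chosen for illustration; the shift values (39, 30, 21, 12) assume 4 KB pages.

```python
# Illustrative only: split_va is a hypothetical helper, not a kernel API.
PAGE_SHIFT = 12         # 4 KB pages: low 12 bits are the offset within the page
PMD_SHIFT = 21          # each PMD entry covers 2 MB
PUD_SHIFT = 30          # each PUD entry covers 1 GB
PGDIR_SHIFT = 39        # each PGD entry covers 512 GB
ENTRIES_PER_TABLE = 512  # 9 index bits per level


def split_va(va):
    """Return (pgd, pud, pmd, pte, offset) indices for a virtual address."""
    mask = ENTRIES_PER_TABLE - 1
    return (
        (va >> PGDIR_SHIFT) & mask,
        (va >> PUD_SHIFT) & mask,
        (va >> PMD_SHIFT) & mask,
        (va >> PAGE_SHIFT) & mask,
        va & ((1 << PAGE_SHIFT) - 1),
    )
```

With this split, the start of the kernel text mapping (ffffffff80000000) lands in the last top-level slot, and the start of the direct mapping (ffff880000000000) lands in slot 272.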

I know that the kernel maps physical memory directly to virtual addresses starting at PAGE_OFFSET, in the direct mapping region

ffff880000000000 - ffffc7ffffffffff (=64 TB) direct mapping of all phys. memory

but I do not know where the kernel code keeps the page tables that manage this direct mapping region, or how to print all the page table entries of the 4-level page tables that cover it. Do you know how to print them?
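
To make the direct mapping concrete, here is a small sketch of the address arithmetic, assuming the layout listed above (PAGE_OFFSET = ffff880000000000, 64 TB region). The function names phys_to_virt and virt_to_phys are used here only as illustrative Python helpers; they mirror the idea of adding or subtracting a constant offset and are not the kernel's actual implementation.

```python
# Sketch of direct-map address arithmetic; values assume the memory map above.
PAGE_OFFSET = 0xffff880000000000   # start of the direct mapping region
DIRECT_MAP_SIZE = 64 << 40         # 64 TB, i.e. up to ffffc7ffffffffff


def phys_to_virt(phys):
    """Virtual address at which a physical address appears in the direct map."""
    if not 0 <= phys < DIRECT_MAP_SIZE:
        raise ValueError("physical address outside the direct mapping")
    return PAGE_OFFSET + phys


def virt_to_phys(virt):
    """Physical address behind a direct-map virtual address."""
    if not PAGE_OFFSET <= virt < PAGE_OFFSET + DIRECT_MAP_SIZE:
        raise ValueError("virtual address outside the direct mapping")
    return virt - PAGE_OFFSET
```

For example, under these assumptions a physical address of 0x196b70000 would appear at virtual address 0xffff880196b70000.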

Naruto asked Nov 19 '13 10:11


1 Answer

Naruto:

I was interested in printing Linux's page tables as well, and I stumbled upon a nifty utility in the kernel source:

arch/x86/mm/dump_pagetables.c

If you build your kernel with CONFIG_X86_PTDUMP set, you can do

cat /sys/kernel/debug/kernel_page_tables

It will walk the kernel's page table hierarchy and print it out. There are two issues with the output, however:

  1. It only lists the virtual addresses and not their associated physical addresses. This disappointed me because I was mostly interested in the virtual mappings into non-RAM system addresses (PCI space, etc.). In other words, I wanted to visually match up what I was seeing in /proc/iomem with /sys/kernel/debug/kernel_page_tables.
  2. It doesn't provide a way to print the per-process page tables. As I understand it, each Linux process shares a large portion of its virtual address space with init_level_4_pgt (the kernel's original, post-boot page table): from PAGE_OFFSET onwards, to be exact. I guess this is done so that kernel code running in the context of a user process (via a system call) can access kernel memory. At any rate, I thought it would be pretty neat to see all this in action.
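
The sharing described in point 2 can be pictured with a toy model. This is a sketch under my own assumptions, not kernel code: the top-level table is modeled as a 512-element Python list, new_process_pgd is a hypothetical name, and the sharing boundary is taken to be the top-level slot that contains PAGE_OFFSET, as described above.

```python
# Toy model of per-process top-level tables sharing the kernel's upper slots.
PGDIR_SHIFT = 39
PGD_ENTRIES = 512
PAGE_OFFSET = 0xffff880000000000


def pgd_index(va):
    """Top-level (PGD) slot that covers a virtual address."""
    return (va >> PGDIR_SHIFT) & (PGD_ENTRIES - 1)


def new_process_pgd(kernel_pgd):
    """Build a fresh top-level table: empty user slots, shared kernel slots."""
    boundary = pgd_index(PAGE_OFFSET)
    pgd = [None] * PGD_ENTRIES
    # Slots from the boundary upward refer to the same lower-level tables
    # as the kernel's table (assumption of this toy model).
    pgd[boundary:] = kernel_pgd[boundary:]
    return pgd


# Hypothetical kernel table with two populated slots, labeled by strings.
kernel_pgd = [None] * PGD_ENTRIES
kernel_pgd[pgd_index(PAGE_OFFSET)] = "direct-map PUD table"
kernel_pgd[pgd_index(0xffffffff80000000)] = "kernel-text PUD table"

proc_a = new_process_pgd(kernel_pgd)
proc_b = new_process_pgd(kernel_pgd)
```

In this model every process sees identical entries from slot 272 upward, while the slots below stay private to each process.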

Here is the link to my patched 3.13.11 kernel with these 2 issues fixed (the branch with the fix is page_table_print).

To use it, simply run

echo <pid_of_interest> > /proc/sys/debug/pgt_dump_process_id

where pid_of_interest is the PID of the process whose page tables you wish to inspect. Note that running

echo -1 > /proc/sys/debug/pgt_dump_process_id

dumps the kernel's page tables. Here is example output for the kernel's page tables:

cat /sys/kernel/debug/kernel_page_tables  | more
CR3= 0x196b70000, va_CR3 = 0xffff880196b70000
Page tables for kernelPGT localtion in memory: phs = 0x1c0e000, virt  = 0xffffffff81c0e000
---[ User Space ]---
0x0000000000000000-0xffff800000000000, phy: 0x0000000000000000-0x0000800001c0e000 16777088T                           pgd
---[ Kernel Space ]---
0xffff800000000000-0xffff880000000000, phy: 0x0000800001c0e000-0x0000000001fe1000        8T                           pgd
---[ Low Kernel Mapping ]---
0xffff880000000000-0xffff880000096000, phy: 0x0000000001fe1000-0x0000000002077000      600K     RW             GLB NX pte
0xffff880000096000-0xffff880000097000, phy: 0x0000000002077000-0x0000000002078000        4K     ro             GLB NX pte
0xffff880000097000-0xffff880000098000, phy: 0x0000000002078000-0x0000000002079000        4K     ro             GLB x  pte
0xffff880000098000-0xffff880000200000, phy: 0x0000000002079000-0x00000000021e0000     1440K     RW             GLB NX pte
0xffff880000200000-0xffff880001000000, phy: 0x00000000021e0000-0x0000000002fe0000       14M     RW         PSE GLB NX pmd
0xffff880001000000-0xffff880001600000, phy: 0x0000000002fe0000-0x00000001b14a1000        6M     ro         PSE GLB NX pmd
0xffff880001600000-0xffff880001734000, phy: 0x00000001b14a1000-0x00000001b15d5000     1232K     ro             GLB NX pte
0xffff880001734000-0xffff880001800000, phy: 0x00000001b15d5000-0x00000000037e0000      816K     RW             GLB NX pte
0xffff880001800000-0xffff880001a00000, phy: 0x00000000037e0000-0x00000001b14a2000        2M     ro         PSE GLB NX pmd
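
If you want to post-process this output, a rough parser could look like the following. This is only a sketch that assumes the line layout shown in the sample above (virtual range, a phy: range, a size with a K/M/G/T suffix, optional flags, then the level); the names Mapping and parse_line are hypothetical. The phy: column is matched but not interpreted.

```python
import re
from dataclasses import dataclass

# Size suffixes used in the size column.
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}

# Layout assumed from the sample output above.
LINE_RE = re.compile(
    r"^0x(?P<start>[0-9a-f]+)-0x(?P<end>[0-9a-f]+),\s+"
    r"phy:\s+0x[0-9a-f]+-0x[0-9a-f]+\s+"
    r"(?P<size>\d+)(?P<unit>[KMGT])\s+(?P<rest>.*)$"
)


@dataclass
class Mapping:
    virt_start: int
    virt_end: int
    size_bytes: int
    flags: list   # e.g. ["RW", "GLB", "NX"]; empty for unmapped ranges
    level: str    # "pgd", "pud", "pmd" or "pte"


def parse_line(line):
    """Parse one mapping line; return None for headers and other lines."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    fields = m.group("rest").split()
    if not fields:
        return None
    return Mapping(
        virt_start=int(m.group("start"), 16),
        virt_end=int(m.group("end"), 16),
        size_bytes=int(m.group("size")) * UNITS[m.group("unit")],
        flags=fields[:-1],
        level=fields[-1],
    )
```

One quick sanity check is that virt_end - virt_start equals size_bytes for each parsed line; section headers such as ---[ Kernel Space ]--- simply come back as None.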

And here is the output after doing

echo 1 > /proc/sys/debug/pgt_dump_process_id


cat /sys/kernel/debug/kernel_page_tables  | more
CR3= 0x17312f000, va_CR3 = 0xffff88017312f000
Page tables for process id = 1
PGT localtion in memory: phs = 0x3623f000, virt  = 0xffff88003623f000
---[ User Space ]---
0x0000000000000000-0x00007f8000000000, phy: 0x0000000000000000-0x0000000036277000   130560G                           pgd
0x00007f8000000000-0x00007ff200000000, phy: 0x0000000036277000-0x000000003627a000      456G                           pud
0x00007ff200000000-0x00007ff207800000, phy: 0x000000003627a000-0x0000000036097000      120M                           pmd
0x00007ff207800000-0x00007ff2079b5000, phy: 0x0000000036097000-0x000000003624c000     1748K                           pte
0x00007ff2079b5000-0x00007ff2079b8000, phy: 0x000000003624c000-0x000000003624f000       12K USR ro                 x  pte
0x00007ff2079b8000-0x00007ff207bbf000, phy: 0x000000003624f000-0x00000000363c3000     2076K                           pte
0x00007ff207bbf000-0x00007ff207bc1000, phy: 0x00000000363c3000-0x00000000363c5000        8K USR ro                 NX pte
0x00007ff207bc1000-0x00007ff207bc4000, phy: 0x00000000363c5000-0x00000000363c8000       12K USR ro                 x  pte
0x00007ff207bc4000-0x00007ff207bc6000, phy: 0x00000000363c8000-0x00000000363ca000        8K                           pte
0x00007ff207bc6000-0x00007ff207bc7000, phy: 0x00000000363ca000-0x00000000363cb000        4K USR ro                 x  pte
0x00007ff207bc7000-0x00007ff207dcb000, phy: 0x00000000363cb000-0x00000000363d0000     2064K                           pte
0x00007ff207dcb000-0x00007ff207dcd000, phy: 0x00000000363d0000-0x00000000363d2000        8K USR ro                 NX pte
0x00007ff207dcd000-0x00007ff207dd2000, phy: 0x00000000363d2000-0x00000000363d7000       20K USR ro                 x  pte
0x00007ff207dd2000-0x00007ff207fe3000, phy: 0x00000000363d7000-0x00000000363fe000     2116K                           pte
0x00007ff207fe3000-0x00007ff207fe5000, phy: 0x00000000363fe000-0x0000000036400000        8K USR ro                 NX pte
0x00007ff207fe5000-0x00007ff207fe7000, phy: 0x0000000036400000-0x0000000036402000        8K                           pte
0x00007ff207fe7000-0x00007ff207fec000, phy: 0x0000000036402000-0x0000000036407000       20K USR ro                 x  pte
0x00007ff207fec000-0x00007ff207fed000, phy: 0x0000000036407000-0x0000000036408000        4K                           pte
0x00007ff207fed000-0x00007ff207fef000, phy: 0x0000000036408000-0x000000003640a000        8K USR ro                 x  pte
0x00007ff207fef000-0x00007ff2081ef000, phy: 0x000000003640a000-0x0000000036214000        2M                           pte
0x00007ff2081ef000-0x00007ff2081f1000, phy: 0x0000000036214000-0x0000000036216000        8K USR ro                 NX pte

I think this is pretty neat; I hope others think the same. :)

rmccabe3701 answered Sep 20 '22 15:09