I want to see which pages are being accessed by my program. One way is to use mprotect with a SIGSEGV handler to note down the pages that are being accessed. However, this involves the overhead of setting protection bits for all the memory pages I'm interested in.
The second way that comes to mind is to invalidate the Translation Lookaside Buffer (TLB) at the beginning and then record the misses: at each miss I would note down the memory page that was addressed. The question is how to handle TLB misses in user space for a Linux program.
If you know an even faster method than either TLB misses or mprotect for noting down dirtied memory pages, kindly let me know. Also, I want a solution for x86 only.
I want to see which pages are being accessed by my program.
You can simulate a CPU and get this data. Variants:
However, this involves the overhead of setting protection bits for all the memory pages
Is this overhead too big?
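To make the cost concrete, here is a minimal sketch of the mprotect approach from the question. It assumes a single anonymous region of `NPAGES` pages; `NPAGES`, `track_init`, `touch_page`, `page_was_touched` and `count_touched` are hypothetical names chosen for illustration. The region starts as `PROT_NONE`, so the first access to each page faults once, the handler records the page and unprotects it, and later accesses to that page run at full speed. Calling `mprotect` from a signal handler is not on the POSIX async-signal-safe list, so treat that part as a Linux-specific assumption.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define NPAGES 16 /* hypothetical size of the tracked region, in pages */

static unsigned char *region;                 /* start of the tracked region */
static size_t page_size;                      /* filled in by track_init()   */
static volatile sig_atomic_t touched[NPAGES]; /* 1 once a page has faulted   */

/* SIGSEGV handler: record the faulting page, then unprotect it so the
 * faulting instruction succeeds when it is restarted. */
static void on_segv(int sig, siginfo_t *si, void *ctx)
{
    (void)ctx;
    uintptr_t addr = (uintptr_t)si->si_addr;
    uintptr_t base = (uintptr_t)region;

    if (addr < base || addr >= base + NPAGES * page_size) {
        /* Not our region: restore the default action and let it re-fault. */
        signal(sig, SIG_DFL);
        return;
    }
    size_t idx = (addr - base) / page_size;
    touched[idx] = 1;
    /* Assumption: mprotect is usable here because it is a plain syscall on Linux. */
    mprotect(region + idx * page_size, page_size, PROT_READ | PROT_WRITE);
}

/* Map the region with no access rights and install the handler. */
int track_init(void)
{
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    region = mmap(NULL, NPAGES * page_size, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Write one byte to page idx (volatile so the store is not optimized away). */
void touch_page(size_t idx)
{
    volatile unsigned char *p = region + idx * page_size;
    *p = 1;
}

int page_was_touched(size_t idx)
{
    return touched[idx] != 0;
}

int count_touched(void)
{
    int n = 0;
    for (size_t i = 0; i < NPAGES; i++)
        n += touched[i] != 0;
    return n;
}
```

With this scheme the overhead is one `mprotect` call over the whole range up front plus one fault per distinct page touched, not one fault per access. Whether that is acceptable depends on how many pages the program touches and how often the tracking has to be re-armed.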
Now the question is how to handle TLB misses in user space for a linux program.
You can't handle a TLB miss in either user space or kernel space (on x86 and many other popular platforms). This is because most platforms handle TLB misses in hardware: the MMU (part of the CPU/chipset) walks the page tables and obtains the physical address transparently. Only when the bits in the page-table entry forbid the access, or when the address region is not mapped, is a page fault generated and delivered to the kernel.
Also, there seems to be no way to dump the TLB on modern CPUs (although the 386DX was able to do this).
You can try to detect a TLB miss by the delay it introduces, but this delay can be hidden by the out-of-order start of the TLB lookup.
Also, most hardware events (memory accesses, TLB accesses, TLB hits, TLB misses) are counted by the hardware performance monitoring unit (this part of the CPU is used by VTune, CodeAnalyst and oprofile). Unfortunately, these are only global event counters, and you can't activate more than 2-4 events at the same time. The good news is that you can set a perfmon counter to raise an interrupt when some count is reached. The interrupt then gives you the address of the instruction ($eip) at which the count was reached. So you can find TLB-miss-heavy hot spots with this hardware (it is present in every modern x86 CPU, both Intel and AMD).
The TLB is transparent to user-space programs; at most you can count TLB misses with a performance counter (without addresses).
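As an illustration of counting misses without addresses, the following sketch uses the Linux `perf_event_open` system call with the generic data-TLB read-miss cache event. The helper names (`dtlb_read_miss_config`, `open_dtlb_miss_counter`, `count_dtlb_misses`, `touch_many_pages`) are made up for this example. It assumes the kernel exposes this event for the CPU and that `perf_event_paranoid` allows unprivileged use; when either assumption fails, the open fails and the helper returns -1.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Encode "data TLB, read access, miss" in the PERF_TYPE_HW_CACHE format:
 * cache id | (operation << 8) | (result << 16). */
uint64_t dtlb_read_miss_config(void)
{
    return (uint64_t)PERF_COUNT_HW_CACHE_DTLB |
           ((uint64_t)PERF_COUNT_HW_CACHE_OP_READ << 8) |
           ((uint64_t)PERF_COUNT_HW_CACHE_RESULT_MISS << 16);
}

/* Open a disabled user-space-only counter for the calling thread.
 * Returns a file descriptor, or -1 if the event is unavailable. */
int open_dtlb_miss_counter(void)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.size = sizeof attr;
    attr.type = PERF_TYPE_HW_CACHE;
    attr.config = dtlb_read_miss_config();
    attr.disabled = 1;
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;
    /* glibc provides no wrapper, so call the raw syscall:
     * pid = 0 (this thread), cpu = -1 (any), group_fd = -1, flags = 0. */
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
}

/* Run work() with the counter enabled; return the miss count, or -1 on error. */
long long count_dtlb_misses(void (*work)(void))
{
    int fd = open_dtlb_miss_counter();
    if (fd < 0)
        return -1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    work();
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    uint64_t count = 0;
    ssize_t n = read(fd, &count, sizeof count);
    close(fd);
    return n == (ssize_t)sizeof count ? (long long)count : -1;
}

/* Sample workload: read one byte from each of 1024 pages (4 KiB assumed). */
void touch_many_pages(void)
{
    const size_t stride = 4096, pages = 1024;
    volatile unsigned char *buf = calloc(pages, stride);
    if (!buf)
        return;
    unsigned sum = 0;
    for (size_t i = 0; i < pages; i++)
        sum += buf[i * stride];
    (void)sum;
    free((void *)buf);
}
```

Calling `count_dtlb_misses(touch_many_pages)` yields only a number: it says how many misses occurred, not which pages caused them, which is exactly the limitation described above.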