I am writing code for a real-time program running on an embedded Linux system. Because it is critical that we don't stall unpredictably on page faults, I would like to prefault the stack so that the region we use is guaranteed to be covered by an mlockall() call.
For the main thread this is simple enough: do a few big alloca()s and make sure to do a write every few pages. This works because at program startup the stack limit is much larger than the amount we need; we end up allocating exactly as much as we prefault.
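A minimal sketch of the main-thread approach described above, assuming GCC or Clang (for `alloca` and the `noinline` attribute). The function name `prefault_stack` is hypothetical, not something from a library.

```c
#include <alloca.h>
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: touch `bytes` of stack below the current frame,
 * one write per page, from the highest address down so that the stack
 * grows in order. Returns the number of writes made by the loop.
 * noinline keeps the alloca() region confined to this call. */
__attribute__((noinline))
size_t prefault_stack(size_t bytes)
{
    long page_l = sysconf(_SC_PAGESIZE);
    assert(page_l > 0);
    size_t page = (size_t)page_l;

    if (bytes == 0)
        return 0;

    volatile unsigned char *p = alloca(bytes);
    size_t touched = 0;

    for (size_t off = bytes; off > 0; off -= (off > page ? page : off)) {
        p[off - 1] = 0;   /* volatile write: cannot be optimized away */
        touched++;
    }
    p[0] = 0;             /* the block is not page-aligned; cover its lowest page */
    return touched;
}
```

Calling `prefault_stack(64 * 1024)` early in `main()`, before or after `mlockall()`, faults in roughly that much stack; the size to pass is whatever bound you have on your own stack usage.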
However, for pthread stacks, will they be allocated using MAP_GROWSDOWN as well? If so, what's the best way to prefault them in, considering that:
I'm aware that I can use pthread_attr_setstack to pass in a manually-allocated stack, but this complicates cleaning up after the thread, and so I'd prefer to avoid this if possible.
As such, what's the best way to perform this prefaulting? It would be sufficient if there was an easy way to find out the lower bound of the stack (just above the guard page); at this point I could simply write to every page from there to the current stack pointer.
Note that portability is not a concern; we'd be happy to have a solution that works only under x86-32 and Linux.
If you use pthread_attr_setstacksize you can still have automatic allocation with a known size.
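As a rough sketch of this suggestion: create the thread with `pthread_attr_setstacksize`, then, inside the thread, ask glibc for the stack bounds with `pthread_getattr_np` and `pthread_attr_getstack` and touch every page below the current frame. The helper names are hypothetical. One assumption to note: glibc versions have differed on whether the reported range includes the guard area, so the sketch conservatively skips guard-size bytes at the bottom, which may leave the lowest usable page untouched. It is meant only for threads made by `pthread_create`, not the main thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: touch every stack page from the current frame
 * down to just above the guard area. Returns pages touched, 0 on error. */
size_t prefault_this_thread_stack(void)
{
    pthread_attr_t attr;
    void *lo;
    size_t size, guard;

    if (pthread_getattr_np(pthread_self(), &attr) != 0)
        return 0;
    int rc = pthread_attr_getstack(&attr, &lo, &size);
    if (rc == 0)
        rc = pthread_attr_getguardsize(&attr, &guard);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return 0;

    uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
    volatile unsigned char marker = 0;            /* lives in this frame */
    uintptr_t top = (uintptr_t)&marker & ~(page - 1);
    uintptr_t bottom = ((uintptr_t)lo + guard + page - 1) & ~(page - 1);
    size_t n = 0;

    for (uintptr_t a = top; a >= bottom; a -= page) {
        volatile unsigned char *p = (volatile unsigned char *)a;
        *p = *p;   /* write fault without changing any live data */
        n++;
    }
    return n;
}

static void *worker(void *arg)
{
    *(size_t *)arg = prefault_this_thread_stack();
    return NULL;
}

/* Hypothetical helper: run one thread with a known stack size.
 * Returns 0 on success and stores the page count in *pages_touched. */
int run_worker_with_stack(size_t stack_bytes, size_t *pages_touched)
{
    pthread_attr_t attr;
    pthread_t tid;

    assert(pages_touched != NULL);
    if (pthread_attr_init(&attr) != 0)
        return -1;
    int rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, worker, pages_touched);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return -1;
    return pthread_join(tid, NULL) == 0 ? 0 : -1;
}
```

The `*p = *p` idiom is used instead of storing a constant because the first page touched may still hold live data from the current frame.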
glibc nptl leaves guard pages between the stacks, so you could also set a SEGV handler and simply scribble until you fault and then longjmp out of the loop. That'd be ugly!
Edit: A really nonportable way would be to open /proc/self/maps to find your stacks!
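For illustration, the /proc/self/maps idea could look roughly like the sketch below: scan the file for the mapping that contains a given address (for example, the address of a local variable in the thread). The function name `find_mapping` is hypothetical. Note the assumption that the stack shows up as its own line; the kernel may merge adjacent mappings with identical permissions, so the reported range can be larger than the stack itself.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: find the /proc/self/maps entry containing `addr`.
 * On success returns 0 and stores the mapping bounds in *lo and *hi
 * (hi is exclusive). Returns -1 if the file cannot be read or no
 * mapping matches. */
int find_mapping(const void *addr, uintptr_t *lo, uintptr_t *hi)
{
    assert(lo != NULL && hi != NULL);

    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    char line[512];
    int found = -1;
    while (fgets(line, sizeof line, f) != NULL) {
        uintptr_t s, e;
        int parsed = sscanf(line, "%" SCNxPTR "-%" SCNxPTR, &s, &e);

        /* Discard the rest of an over-long line so it is not parsed twice. */
        while (strchr(line, '\n') == NULL &&
               fgets(line, sizeof line, f) != NULL)
            ;

        if (parsed == 2 && (uintptr_t)addr >= s && (uintptr_t)addr < e) {
            *lo = s;
            *hi = e;
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Inside a thread, passing the address of any local variable gives a candidate lower bound in `*lo`; the guard area normally appears as a separate entry without read or write permission just below it.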
Yes. If you have called mlockall(MCL_CURRENT | MCL_FUTURE) before pthread_create, the page faults for the thread stack happen when the thread is started, and after that there are no further page faults when the thread accesses its stack. This is why people set a suitable stack size for each newly created thread: it avoids locking too much memory for threads created later. Take a look at: https://rt.wiki.kernel.org/index.php/Threaded_RT-application_with_memory_locking_and_stack_handling_example
If you change the thread stack size to 7MB, you will see:
Initial count : Pagefaults, Major:0 (Allowed >=0), Minor:190 (Allowed >=0)
mlockall() generated : Pagefaults, Major:0 (Allowed >=0), Minor:393 (Allowed >=0)
malloc() and touch generated : Pagefaults, Major:0 (Allowed >=0), Minor:25633 (Allowed >=0)
2nd malloc() and use generated: Pagefaults, Major:0 (Allowed 0), Minor:0 (Allowed 0)
Look at the output of ps -leyf, and see that the RSS is now about 100 [MB]
Press <ENTER> to exit
I am an RT-thread with a stack that does not generate page-faults during use, stacksize=7340032
Caused by creating thread : Pagefaults, Major:0 (Allowed >=0), Minor:1797 (Allowed >=0)
Caused by using thread stack : Pagefaults, Major:0 (Allowed 0), Minor:0 (Allowed 0)
1797 minor page faults happen while creating the thread; at 4 KiB per page that is about 7MB, i.e. essentially the whole stack. -barry
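To check this behaviour on your own system, something along these lines may help. It is only a sketch with hypothetical helper names: it counts minor faults via `getrusage` around creating and joining a thread with a given stack size. Whether `mlockall` succeeds depends on privileges and RLIMIT_MEMLOCK, so locking is kept in a separate function that the caller may or may not invoke first.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Lock current and future mappings; returns 0 on success, -1 on error. */
int lock_all_memory(void)
{
    return mlockall(MCL_CURRENT | MCL_FUTURE);
}

/* Minor page faults of this process so far, or -1 on error. */
long minor_faults_now(void)
{
    struct rusage u;
    return getrusage(RUSAGE_SELF, &u) == 0 ? u.ru_minflt : -1;
}

static void *idle_thread(void *arg)
{
    return arg;
}

/* Hypothetical helper: minor faults observed while creating and joining
 * one thread with the given stack size, or -1 on error. */
long faults_from_thread_create(size_t stack_bytes)
{
    pthread_attr_t attr;
    pthread_t tid;

    assert(stack_bytes > 0);
    if (pthread_attr_init(&attr) != 0)
        return -1;
    int rc = pthread_attr_setstacksize(&attr, stack_bytes);
    long before = minor_faults_now();
    if (rc == 0)
        rc = pthread_create(&tid, &attr, idle_thread, NULL);
    pthread_attr_destroy(&attr);
    if (rc != 0 || pthread_join(tid, NULL) != 0)
        return -1;
    long after = minor_faults_now();
    return (before < 0 || after < 0) ? -1 : after - before;
}
```

If `lock_all_memory()` succeeds before the call, the result for a 7MB stack should be on the order of the stack size divided by the page size, in line with the log above; without locking it is expected to be much smaller, since only the pages actually touched are faulted in.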