On a Linux box with plenty of memory (a few gigs), I need to access a big file randomly, as fast as possible.
I was thinking of doing a cat myfile > /dev/null
before accessing it, so that the file's pages are brought into memory sequentially, which should be faster than pulling them in through plain random access.
Does this approach make sense to you?
While doing that may force the contents of the file into the system's cache, you are better off using posix_fadvise() (with the POSIX_FADV_WILLNEED advice) or the (blocking) readahead() call to make the kernel precache the data you will need.
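For example, a minimal sketch of prefetching a whole file before starting the random reads might look like this (the "myfile" path is just the placeholder from the question):

    #define _GNU_SOURCE           /* for readahead() */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("myfile", O_RDONLY);   /* placeholder path */
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        /* Advisory hint: we will need the whole file soon.
           Returns immediately; the kernel prefetches in the background. */
        posix_fadvise(fd, 0, st.st_size, POSIX_FADV_WILLNEED);

        /* Alternatively, readahead() blocks until the requested range
           has been read into the page cache:
           readahead(fd, 0, st.st_size); */

        /* ... proceed with the random reads on fd ... */
        close(fd);
        return 0;
    }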
EDIT:
You might also want to try using the POSIX_FADV_RANDOM advice to disable readahead altogether.
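A sketch of that variant, reusing the same file descriptor as above:

    /* Disable kernel readahead for this descriptor, since the access
       pattern will be random rather than sequential.
       A length of 0 means "from offset to end of file". */
    posix_fadvise(fd, 0, 0, POSIX_FADV_RANDOM);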
There's an article with a decent explanation of usage here: Advising the Linux Kernel on File I/O
As the others said, you'll need to benchmark it in your particular case.
It is quite possible it will result in a significant performance increase, though. On traditional rotating media (i.e. a hard disk), sequential access (cat file > /dev/null or fadvise) is much faster than random access.
Only one way to be sure that any (possibly premature?) optimization is worthwhile: benchmark it.
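As a rough illustration of such a benchmark (the path, 4 KiB read size, and iteration count are arbitrary assumptions; it also assumes the file is at least 4 KiB), you could time a burst of random reads once with a cold cache and once after the precaching step, and compare:

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/stat.h>
    #include <time.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("myfile", O_RDONLY);   /* placeholder path */
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        char buf[4096];
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);

        /* 10000 random 4 KiB reads; adjust to match your real workload. */
        for (int i = 0; i < 10000; i++) {
            off_t off = (rand() % (st.st_size / 4096)) * 4096;
            if (pread(fd, buf, sizeof buf, off) < 0) { perror("pread"); return 1; }
        }

        clock_gettime(CLOCK_MONOTONIC, &t1);
        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("10000 random reads took %.3f s\n", secs);

        close(fd);
        return 0;
    }

Run it on a cold cache (e.g. after echo 3 > /proc/sys/vm/drop_caches as root), then again after cat myfile > /dev/null or the posix_fadvise() call, and see whether the difference matters for your workload.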