 

Can running 'cat' speed up subsequent random file access on a Linux box?

On a Linux box with plenty of memory (a few GiB), I need random access into a big file, as fast as possible.

I was thinking of running cat myfile > /dev/null before accessing it, so that the file's pages are pulled into memory sequentially, which should be faster than faulting them in one by one through cold random access.

Does this approach make sense to you?

asked Aug 24 '09 by jeje

3 Answers

While doing that may force the contents of the file into the system's cache, you are better off using posix_fadvise() (with the POSIX_FADV_WILLNEED advice) or the (blocking) readahead() call to make the kernel precache the data you will need.

EDIT: You might also want to try using the POSIX_FADV_RANDOM advice to disable readahead altogether. There's an article with a decent explanation of usage here: Advising the Linux Kernel on File I/O

answered Sep 18 '22 by Hasturkun

As the others said, you'll need to benchmark it in your particular case.

It is quite possible it will result in a significant performance increase, though. On traditional rotating media (i.e. a hard disk), sequential access (whether via cat file > /dev/null or fadvise) is much faster than random access.

answered Sep 22 '22 by Kristof Provost


Only one way to be sure that any (possibly premature?) optimization is worthwhile: benchmark it.

answered Sep 22 '22 by Paul Dixon