
Read file without disk caching in Linux

Tags: linux, io

I have a C program that runs only once a week and reads a large number of files, each of them only once. Because Linux caches everything that is read, these files fill up the cache needlessly, and this slows the system down a lot unless it has an SSD.

So how do I open and read from a file without filling up the disk cache?

Note:

By disk caching I mean that when you read a file twice, the second read is served from RAM, not from disk. That is, data read from the disk stays in RAM, so subsequent reads of the same file do not need to go back to the disk.

sashoalm asked Mar 07 '13 08:03




2 Answers

I believe passing O_DIRECT to open() should help:

O_DIRECT (since Linux 2.4.10)

Try to minimize cache effects of the I/O to and from this file. In general this will degrade performance, but it is useful in special situations, such as when applications do their own caching. File I/O is done directly to/from user-space buffers. The O_DIRECT flag on its own makes an effort to transfer data synchronously, but does not give the guarantees of the O_SYNC flag that data and necessary metadata are transferred. To guarantee synchronous I/O, O_SYNC must be used in addition to O_DIRECT.

There are further detailed notes on O_DIRECT towards the bottom of the man page, including a fun quote from Linus.
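As a rough illustration of the O_DIRECT approach, here is a minimal sketch that reads a whole file while bypassing the page cache. The function name read_file_direct, the 4096-byte alignment, and the 1 MiB chunk size are assumptions made for this example, not values taken from the man page; O_DIRECT generally requires the buffer address, transfer size, and file offset to be aligned to the device's logical block size, and some filesystems reject the flag entirely (open() then fails, typically with EINVAL).

```c
#define _GNU_SOURCE            /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define ALIGNMENT 4096         /* assumed alignment; real requirement depends on the device */
#define CHUNK     (1 << 20)    /* assumed read size: 1 MiB, a multiple of ALIGNMENT */

/* Hypothetical helper: reads the file at `path` with O_DIRECT.
 * Returns the number of bytes read, or -1 on any error. */
long long read_file_direct(const char *path)
{
    void *buf = NULL;
    long long total = 0;
    ssize_t n;

    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return -1;             /* e.g. the filesystem does not support O_DIRECT */

    /* O_DIRECT needs an aligned user-space buffer; malloc() does not guarantee that. */
    if (posix_memalign(&buf, ALIGNMENT, CHUNK) != 0) {
        close(fd);
        return -1;
    }

    /* Each read transfers data straight into buf; the last one may be short at EOF. */
    while ((n = read(fd, buf, CHUNK)) > 0) {
        /* process the n bytes in buf here */
        total += n;
    }

    free(buf);
    close(fd);
    return n < 0 ? -1 : total;
}
```

A caller would invoke read_file_direct() once per file and treat -1 as a signal to fall back to an ordinary buffered read.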

NPE answered Sep 20 '22 14:09


You can use posix_fadvise() with the POSIX_FADV_DONTNEED advice to request that the system free the pages you've already read.
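A minimal sketch of this approach might look like the following. The function name read_and_drop_cache and the 64 KiB buffer size are assumptions made for illustration. Note that posix_fadvise() returns an error number directly rather than setting errno, that a length of 0 means "to the end of the file", and that the call is only advice, so the kernel may still keep some pages.

```c
#define _POSIX_C_SOURCE 200112L   /* exposes posix_fadvise() */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: reads the file at `path` normally, then asks the
 * kernel to drop its cached pages. Returns bytes read, or -1 on error. */
long long read_and_drop_cache(const char *path)
{
    static char buf[1 << 16];     /* assumed buffer size: 64 KiB */
    long long total = 0;
    ssize_t n;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    while ((n = read(fd, buf, sizeof buf)) > 0) {
        /* process the n bytes in buf here */
        total += n;
    }

    /* offset 0 and len 0 cover the whole file; the pages are clean because
     * the file was only read, so the kernel is free to discard them. */
    int rc = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    if (rc != 0)
        fprintf(stderr, "posix_fadvise: %s\n", strerror(rc));

    close(fd);
    return n < 0 ? -1 : total;
}
```

For very large files, the same advice could instead be issued after each chunk, passing that chunk's offset and length, so cached pages are released as the read progresses rather than all at the end.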

Hasturkun answered Sep 20 '22 14:09