Let's say I am using c++ files stream asynchronously. I mean never using std::flush nor std::endl. My application writes a lot of data to a file and abruptly crashes down. Is the data remaining in the cache system flushed to the disk, or discarded (and lost)?
Buffers are the disk block representation of the data that is stored under the page cache. In addition, the buffer contains the metadata of the files or data which resides under the page cache. On the other hand, a cache is a temporary storage area to store frequently accessed data for rapid access.
The buffer cache consists of two kinds of data structures: A set of buffer heads describing the buffers in the cache. A hash table to help the kernel quickly derive the buffer head that describes the buffer associated with a given pair of device and block numbers.
Buffer memory is a temporary storage area in the main memory (RAM) that stores data transferring between two or more devices or between an application and a device. Buffering compensates for the difference in transfer speeds between the sender and receiver of the data.
Linux does automatically free memory used for cache and buffering. It does it on a page by page basis instead of clearing it all at once.
The stuff in the library buffer (which you flush with std::flush or such) is lost, the data in the OS kernel buffers (which you can flush e.g. with fsync()) is not lost unless the OS itself crashes.
Complicating this problem is that there are multiple 'caches' in play.
C++ streams have their own internal buffering mechanism. Streams don't ask the OS to write to disk until either (a) you've sent enough data into the buffer that the streams library thinks the write wouldn't be wasted (b) you ask for a flush specifically (c) the stream is in line-buffering mode, and you've sent along the endl
. Any data in these buffers are lost when the program crashes.
The OS will buffer writes to make best use of the limited amount of disk IO available. Writes will typically be flushed within five to thirty seconds; sooner if the programmer (or libraries) calls fdatasync(2)
or fsync(2)
or sync(2)
(which asks for all dirty data to be flushed). Any data in the OS buffers are written to disk (eventually) when the program crashes, lost if the kernel crashes.
The hard drive will buffer writes to try to make the best use of its slow head, rotational latency, etc. Data arrives in this buffer when the OS flushes its caches. Data in these buffers are written to disk when the program crashes, will probably be written to disk if the kernel crashes, and might be written to disk if the power is suddenly removed from the drive. (Some have enough power to continue writing their buffers, typically this would take less than a second anyway.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With