 

Check if file content has actually been written to disk - not just queued in the disk controller's buffer

I wrote a program that combines two small files into a single, bigger file. I first read data from the input files, merge the data, and write the output to a temp file. Once this completes, I rename the temp file to the desired file name (located on the same disk partition). Here is pseudo code:

#include <cstdio>  // fopen, fwrite, fflush, fclose, std::rename, std::remove

FILE* fp_1 = fopen("file_1.dat", "r+b");
FILE* fp_2 = fopen("file_2.dat", "r+b");
FILE* fp_out = fopen("file_tmp.dat", "w+b");

// 1. Read data for the key in two files
const char* data_1 = ...;
const char* data_2 = ...;

// 2. Merge data, store in an allocated buffer

// 3. Write merged buffer to temp file
fwrite(temp_buff, estimated_size, 1, fp_out);
fflush(fp_out);

fclose(fp_1);
fclose(fp_2);
fclose(fp_out);

// Now rename temp file to desired file name
if(std::rename("file_tmp.dat", "file_out.dat") == 0)
{
    std::remove("file_1.dat");
    std::remove("file_2.dat");
}

I repeatedly tested the program with two input files of 5 MB each. One time I suddenly shut down the system by unplugging the power cable. After restarting the system I checked the data and found that the input files had been removed and file_out.dat was filled with all zeros. This made me believe that the system went down right after the two input files were removed, while the output data was still sitting in the disk controller's buffer. If this is true, is there any way I can check whether the data has actually been written to disk?

asked Sep 30 '16 by duong_dajgja

1 Answer

Not in the general case. Even if you tell the OS to wait until the data is written (with the sync API family), some disks lie to the OS, claiming the write finished when it's really just queued in the hard drive's onboard RAM cache, which will be lost on abrupt power loss.

The best you can do is explicitly ask the OS to tell the disk to "really, really sync everything and block until it's done" after you've performed the fflush. (fflush only tells the stdio library to hand all of its user-mode buffered data to the OS; the OS typically keeps that data in kernel buffers and writes it to disk later, in the background.) You can limit the scope to a single file with fsync, or go broader with sync (which syncs all file systems) or syncfs (which syncs just the file system corresponding to a given file descriptor).
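
A minimal sketch of that fflush-then-fsync pairing might look like the following (POSIX assumed; flush_to_disk is just an illustrative helper name, not an existing API):

#include <cstdio>     // fflush
#include <unistd.h>   // fsync, fileno (POSIX)

// Push a stdio stream's contents as far toward stable storage as the OS
// will let us. Returns 0 on success, -1 on failure.
static int flush_to_disk(FILE* fp)
{
    if (fflush(fp) != 0)          // stdio's user-mode buffer -> kernel buffers
        return -1;
    if (fsync(fileno(fp)) != 0)   // kernel buffers -> disk (blocks until done)
        return -1;
    return 0;
}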

For maximum safety, you'd want to:

  1. Do a targeted fsync after the final fflush but before the rename (so the new file is complete on disk before replacing the old one), and
  2. Do a broader sync/syncfs after the rename but before the remove calls (so the metadata updates from the rename are complete before you delete the source files).

Omitting step 1 is okay if you don't mind corrupted output data in cases where the input data still exists.
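
Putting both steps together, a rough sketch of how the tail end of the question's program might look (Linux assumed, since syncfs is a Linux/glibc extension; error handling trimmed for brevity):

#include <cstdio>     // fflush, fclose, std::rename, std::remove
#include <fcntl.h>    // open
#include <unistd.h>   // fsync, syncfs, fileno, close

// ... fp_out has been written as in the question ...

fflush(fp_out);
fsync(fileno(fp_out));                  // step 1: output data is on disk before the rename
fclose(fp_out);

if (std::rename("file_tmp.dat", "file_out.dat") == 0)
{
    int fs_fd = open(".", O_RDONLY);    // any descriptor on the same file system
    if (fs_fd != -1)
    {
        syncfs(fs_fd);                  // step 2: the rename's metadata is on disk
        close(fs_fd);                   //         before the source files go away
    }
    std::remove("file_1.dat");
    std::remove("file_2.dat");
}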

Of course, as I said, this is all best effort; if the disk controller is lying to the OS, there is nothing you can do short of writing new firmware and drivers for the disk, which is probably going too far.

answered Nov 18 '22 by ShadowRanger