
Is write/fwrite guaranteed to be sequential?

Tags: c, posix, libc

Is the data written via write (or fwrite) guaranteed to be persisted to disk sequentially? I am asking in particular about fault tolerance: if the system fails during the write, will it behave as though the first bytes were written first and writing stopped mid-stream, as opposed to random blocks having been written?

Also, are sequential calls to write/fwrite guaranteed to be persisted in the order they were issued? In POSIX I can find only that a call to read is guaranteed to see the data from a previous write.

I'm asking because I'm creating a fault-tolerant data store that persists to disk. My logical order of writing is such that faults won't ruin the data, but if that logical order isn't obeyed I have a problem.

Note: I'm not asking if persistence is guaranteed. Only that if my calls to write do eventually persist they obey the order in which I actually write.

edA-qa mort-ora-y asked Oct 19 '22 13:10



2 Answers

The POSIX docs for write() state that "If the O_DSYNC bit has been set, write I/O operations on the file descriptor shall complete as defined by synchronized I/O data integrity completion". Presumably, if the O_DSYNC bit isn't set, it is unspecified when the written data actually reaches the storage device. POSIX also says that "This volume of POSIX.1-2008 is also silent about any effects of application-level caching (such as that done by stdio)", so I think there is no guarantee for fwrite() either.
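To make the distinction concrete, here is a minimal sketch of the two cases, assuming a POSIX system where O_DSYNC is available. The helper names append_dsync and flush_stream are hypothetical, invented for illustration. The first opens the file with O_DSYNC so each write() is meant to complete with data integrity before returning; the second handles stdio by first pushing the application-level buffer to the kernel with fflush() and then calling fsync(). Whether the hardware honors either request is outside what this code can check.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all `len` bytes, retrying on short writes and EINTR. Returns 0 or -1. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: append to `path` through a descriptor opened with
 * O_DSYNC, so each write() should not return until the data has been
 * handed to stable storage (as far as the OS can tell). */
int append_dsync(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND | O_DSYNC, 0644);
    if (fd < 0)
        return -1;
    int rc = write_all(fd, buf, len);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/* Hypothetical helper for stdio: fwrite() only fills a user-space buffer,
 * so flush that buffer to the kernel first, then ask the kernel to sync. */
int flush_stream(FILE *fp)
{
    if (fflush(fp) != 0)
        return -1;
    return fsync(fileno(fp));
}
```

With these, two consecutive append_dsync() calls should each be on disk before the next one starts, which gives ordering between calls at the cost of one synchronous write per call. It says nothing about the order in which the blocks inside a single large write are persisted.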

Michael Burr answered Nov 08 '22 10:11



I am not an expert, but I might know enough to point you in the right direction:

The most disastrous case is if you lose power, so that is the only one worth considering.

  • Start with a file with X bytes of meaningful content, and a header that indicates it.
  • Write Y bytes of meaningful content somewhere that won't invalidate X.
  • Call fsync (slow!).
  • Update the header (probably has to be less than your disk's block size).
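The steps above can be sketched roughly as follows. This is only an illustration under several assumptions of mine: the file layout (an 8-byte committed length at offset 0, data after it), the names HEADER_SIZE, store_committed and store_append, and native byte order are all hypothetical. It also assumes fsync really does what it promises and that the 8-byte header update lands atomically, neither of which the code can verify. The second fsync at the end is my addition so the commit itself is pushed out; there are no checksums.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical layout: 8-byte committed length at offset 0, data follows. */
#define HEADER_SIZE 8

/* pwrite all bytes at `off`, retrying on short writes and EINTR. */
static int pwrite_all(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        off += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Step 1: read how many data bytes the header says are meaningful (X).
 * An empty file counts as zero committed bytes. Returns 0 or -1. */
int store_committed(int fd, uint64_t *out)
{
    uint64_t v = 0;
    ssize_t n = pread(fd, &v, sizeof v, 0);
    if (n == 0) {
        *out = 0;
        return 0;
    }
    if (n != (ssize_t)sizeof v)
        return -1;
    *out = v;
    return 0;
}

/* Steps 2-4: write Y new bytes past the committed region, fsync, and only
 * then publish them by updating the header. */
int store_append(int fd, const void *buf, size_t len)
{
    uint64_t committed;
    assert(buf != NULL);
    if (store_committed(fd, &committed) != 0)
        return -1;
    /* Step 2: new bytes go after the committed data, so X stays valid. */
    if (pwrite_all(fd, buf, len, (off_t)(HEADER_SIZE + committed)) != 0)
        return -1;
    /* Step 3: make the new bytes durable before anything points at them. */
    if (fsync(fd) != 0)
        return -1;
    /* Step 4: small header update that makes the new bytes count. */
    committed += len;
    if (pwrite_all(fd, &committed, sizeof committed, 0) != 0)
        return -1;
    return fsync(fd);
}
```

A reader would call store_committed() and ignore anything past HEADER_SIZE plus that length. If power is lost before step 4, the header still shows the old length and the half-written tail is simply never looked at.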

I don't know if changing the length of a file is safe. I don't know how much depends on the filesystem mount mode, except that any "safe" mode is probably completely unusable for systems that need even a slight level of performance.

Keep in mind that on some systems the fsync call lies and just returns without doing anything safely. You can tell because it returns quickly. For this reason, you need to make pretty large transactions (i.e. much larger than application-level transactions).

Keep in mind that the kind of people who solve this problem in the real world get paid high six figures at least. The best answer for the rest of us is either "just send the data to Postgres and let it deal with it" or "accept that we might lose data and have to revert to an hourly backup."

o11c answered Nov 08 '22 10:11
