
Is overwriting a small file atomic on ext4?

Assume we have a file of FILE_SIZE bytes, and:

  • FILE_SIZE <= min(page_size, physical_block_size);
  • file size never changes (i.e. truncate() or append write() are never performed);
  • file is modified only by completely overwriting its contents using:

    pwrite(fd, buf, FILE_SIZE, 0);
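
For concreteness, the size precondition above could be checked roughly like this. It is only a sketch: `physical_block_size()` and `fits_single_unit()` are hypothetical helper names, and the device name passed in (for example "sda") is an assumption about which block device backs the file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Read /sys/block/<dev>/queue/physical_block_size.
 * `dev` is a caller-supplied device name (an assumption, e.g. "sda").
 * Returns the size in bytes, or -1 if it cannot be read. */
static long physical_block_size(const char *dev) {
    char path[256];
    long size = -1;
    int n = snprintf(path, sizeof path,
                     "/sys/block/%s/queue/physical_block_size", dev);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%ld", &size) != 1)
        size = -1;
    fclose(f);
    return size;
}

/* Returns 1 if file_size <= min(page_size, block_size), otherwise 0. */
static int fits_single_unit(size_t file_size, long page_size, long block_size) {
    assert(page_size > 0 && block_size > 0);
    long limit = page_size < block_size ? page_size : block_size;
    return file_size <= (size_t)limit;
}
```

A caller would combine the two with the page size from `sysconf(_SC_PAGESIZE)`, e.g. `fits_single_unit(FILE_SIZE, sysconf(_SC_PAGESIZE), physical_block_size("sda"))`, after checking that neither value is -1.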
    

Is it guaranteed on ext4 that:

  1. Such writes are atomic with respect to concurrent reads?
  2. Such writes are transactional with respect to a system crash?

    (i.e., after a crash the file's contents come entirely from some previous write, and we'll never see a partial write or an empty file)

Is the second true:

  • with data=ordered?
  • with data=journal or alternatively with journaling enabled for a single file?

    (using ioctl(fd, EXT4_IOC_SETFLAGS, EXT4_JOURNAL_DATA_FL))

  • when physical_block_size < FILE_SIZE <= page_size?
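
For reference, a sketch of how the per-file journaling flag is usually set from C. The ioctl takes a pointer to an int holding the whole flag word, so the existing flags are read first and the new bit is OR-ed in. I'm assuming here that the generic `FS_IOC_GETFLAGS`, `FS_IOC_SETFLAGS` and `FS_JOURNAL_DATA_FL` names from `<linux/fs.h>` correspond to the `EXT4_*` names above; `enable_data_journaling()` is a hypothetical helper name.

```c
#include <assert.h>
#include <linux/fs.h>   /* FS_IOC_GETFLAGS, FS_IOC_SETFLAGS, FS_JOURNAL_DATA_FL */
#include <sys/ioctl.h>

/* Turn on data journaling for the file behind fd.
 * Returns 0 on success, -1 on failure (errno is set by ioctl). */
static int enable_data_journaling(int fd) {
    int flags = 0;

    /* Fetch the current inode flags so the other bits are preserved. */
    if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0)
        return -1;

    flags |= FS_JOURNAL_DATA_FL;

    /* Write the updated flag word back. */
    if (ioctl(fd, FS_IOC_SETFLAGS, &flags) < 0)
        return -1;

    return 0;
}
```

The call may fail on filesystems that don't support the flag or when the process lacks the needed privileges, so the return value should be checked.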


I've found a related question that links to a discussion from 2011. However:

  • I didn't find an explicit answer to my question 2.
  • If the above is true, is it documented anywhere?
gavv asked Sep 29 '15 18:09


2 Answers

In my experiment, it was not atomic.

I ran two processes, a writer and a reader. The writer overwrites the file in a loop while the reader reads it in a loop.

Writer process:

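/* Two alternating patterns; each slot is 18 bytes. */
char buf[][18] = {
    "xxxxxxxxxxxxxxxx",
    "yyyyyyyyyyyyyyyy"
};
int i = 0;
while (1) {
    /* Overwrite the whole record at offset 0. */
    pwrite(fd, buf[i], 18, 0);
    i = (i + 1) % 2;
}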

Reader process:

char readbuf[18];
while (1) {
    pread(fd, readbuf, 18, 0);
    // check if readbuf is either buf[0] or buf[1]
}

After running both processes for a while, I saw readbuf contain either xxxxxxxxxxxxxxxxyy or yyyyyyyyyyyyyyyyxx.

This shows that the writes are not atomic. In my case, 16-byte writes were always atomic.
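
A self-contained variant of this kind of experiment, as a sketch rather than the exact program I ran: it forks a writer, fills each 18-byte record with a single repeated byte so a mixed record is easy to detect, and counts torn reads. `REC_SIZE`, `fill_record()`, `is_torn()` and `count_torn_reads()` are names made up for this example, and the result depends on the kernel and filesystem.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define REC_SIZE 18  /* record size used in the experiment */

/* Fill a record with one repeated byte so a torn read is easy to spot. */
static void fill_record(char *rec, char c) { memset(rec, c, REC_SIZE); }

/* A record is torn if it mixes bytes from two different writes. */
static int is_torn(const char *rec) {
    for (int i = 1; i < REC_SIZE; i++)
        if (rec[i] != rec[0])
            return 1;
    return 0;
}

/* Returns the number of torn reads seen in `iterations` reads, or -1 on error. */
static long count_torn_reads(const char *path, long iterations) {
    assert(iterations >= 0);
    char rec[2][REC_SIZE];
    fill_record(rec[0], 'x');
    fill_record(rec[1], 'y');

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    /* Seed the file so the reader never sees a short or empty file. */
    if (pwrite(fd, rec[0], REC_SIZE, 0) != REC_SIZE) { close(fd); return -1; }

    pid_t writer = fork();
    if (writer < 0) { close(fd); return -1; }
    if (writer == 0) {
        /* Child: overwrite the whole record forever, alternating patterns. */
        for (int i = 0;; i ^= 1)
            if (pwrite(fd, rec[i], REC_SIZE, 0) != REC_SIZE)
                _exit(1);
    }

    /* Parent: read the record repeatedly and count mixed contents. */
    long torn = 0;
    char buf[REC_SIZE];
    for (long n = 0; n < iterations; n++) {
        if (pread(fd, buf, REC_SIZE, 0) != REC_SIZE) { torn = -1; break; }
        if (is_torn(buf))
            torn++;
    }

    kill(writer, SIGKILL);      /* stop the writer */
    waitpid(writer, NULL, 0);   /* and reap it */
    close(fd);
    return torn;
}
```

A nonzero return value means at least one read observed a mix of the two patterns. A zero result proves nothing, since the race may simply not have been hit in that run.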

The answer I received was: POSIX doesn't mandate atomicity for reads and writes except on pipes. The 16-byte atomicity that I saw is kernel-specific and may change in the future.

Details are in the original post: write(2)/read(2) atomicity between processes in linux

Vineeth Pillai answered Sep 21 '22 16:09


I am familiar with the theory of filesystems in general, not with the implementation of ext4. Take this as an educated guess.

Yes, I believe single-sector reads and writes will be atomic, because:

  • The link you provided quotes: "Currently concurrent reads/writes are atomic only wrt individual pages, however are not on the system call."
  • According to Stephen Tweedie, writes of a single disk sector (512 bytes) are atomic. In a private email conversation, he acknowledged that this guarantee is only as good as the hardware.
  • Ext filesystems overwrite data in place: there is no copy-on-write and no allocation.
  • There is some effort to implement inline data, where the data of very small files can fit in the inode itself. If you only need to store a few bytes, that may have an impact.

I'm not sure about a full page, but in full journaling mode it would make little sense to send less than a page to the journal before committing.

ArekBulski answered Sep 19 '22 16:09