
Is rename() without fsync() safe?

Is it safe to call rename(tmppath, path) without calling fsync(tmppath_fd) first?

I want the path to always point to a complete file. I care mainly about Ext4. Is the rename() promised to be safe in all future Linux kernel versions?

A usage example in Python:

    import os

    def store_atomically(path, data):
        tmppath = path + ".tmp"
        output = open(tmppath, "wb")
        output.write(data)

        output.flush()
        os.fsync(output.fileno())  # The needed fsync().
        output.close()
        os.rename(tmppath, path)
Ivo Danihelka asked Sep 15 '11 15:09


2 Answers

No.

Look at libeatmydata, and this presentation:

Eat My Data: How Everybody Gets File IO Wrong

http://www.oscon.com/oscon2008/public/schedule/detail/3172

by Stewart Smith from MySQL.

In case it is offline/no longer available, I keep a copy of it:

  • The video here
  • The presentation slides (online version of slides)
sehe answered Sep 20 '22 12:09


From the ext4 documentation:

    When mounting an ext4 filesystem, the following option are accepted:
    (*) == default

    auto_da_alloc(*)    Many broken applications don't use fsync() when
    noauto_da_alloc     replacing existing files via patterns such as
                        fd = open("foo.new")/write(fd,..)/close(fd)/
                        rename("foo.new", "foo"), or worse yet,
                        fd = open("foo", O_TRUNC)/write(fd,..)/close(fd).
                        If auto_da_alloc is enabled, ext4 will detect
                        the replace-via-rename and replace-via-truncate
                        patterns and force that any delayed allocation
                        blocks are allocated such that at the next
                        journal commit, in the default data=ordered
                        mode, the data blocks of the new file are forced
                        to disk before the rename() operation is
                        committed.  This provides roughly the same level
                        of guarantees as ext3, and avoids the
                        "zero-length" problem that can happen when a
                        system crashes before the delayed allocation
                        blocks are forced to disk.

Judging by the wording "broken applications", the ext4 developers clearly consider this bad practice, but the pattern is so widely used that ext4 itself was patched to handle it.

So if your usage fits the replace-via-rename pattern and the filesystem is mounted with the defaults (auto_da_alloc, data=ordered), you should be safe.
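Since the protection depends on a mount option, it can help to check how a filesystem is actually mounted. The following is only a minimal sketch: the function name `ext4_auto_da_alloc_status` and the sample text are made up for illustration, and it assumes that the non-default `noauto_da_alloc` option is listed in the options field of `/proc/mounts` when it is set.

```python
def ext4_auto_da_alloc_status(mounts_text):
    """Map each ext4 mount point to True if auto_da_alloc looks enabled."""
    status = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "ext4":
            continue
        options = fields[3].split(",")
        # Assumption: auto_da_alloc is the default, so only the
        # explicit "noauto_da_alloc" option marks it as disabled.
        status[fields[1]] = "noauto_da_alloc" not in options
    return status


# Hypothetical sample in /proc/mounts format, used instead of the live file.
SAMPLE = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "/dev/sdb1 /data ext4 rw,noauto_da_alloc 0 0\n"
    "tmpfs /tmp tmpfs rw 0 0\n"
)

print(ext4_auto_da_alloc_status(SAMPLE))
```

On a live Linux system the same function could be fed the contents of `/proc/mounts` instead of the sample string.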

If not, I suggest investigating further instead of inserting fsync() here and there just to be safe. That might not be a good idea, since fsync() can be a major performance hit on ext3.

On the other hand, flushing before the rename is the correct way to do the replacement on non-journaling file systems. Maybe that is why ext4 at first expected this behavior from programs; the auto_da_alloc option was added later as a fix. There is also this ext3 patch for writeback mode (where data writes are not ordered against the journal), which tries to help careless programs by flushing asynchronously on rename to lower the chance of data loss.
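For reference, the flush-before-rename sequence can be sketched as follows, reusing the `store_atomically` name from the question. The fsync() on the containing directory is an extra step that is not mentioned in the question or the answers; it is included here as an assumption about what is needed to make the rename itself durable, and it relies on POSIX behavior (opening a directory with `os.open`).

```python
import os
import tempfile


def store_atomically(path, data):
    """Write data to a temporary file, flush it to disk, then rename it over path."""
    tmppath = path + ".tmp"
    with open(tmppath, "wb") as output:
        output.write(data)
        output.flush()             # Python's buffer -> kernel
        os.fsync(output.fileno())  # kernel -> disk, before the rename
    os.rename(tmppath, path)
    # Assumption beyond the original question: also fsync the directory
    # so that the new directory entry reaches the disk (POSIX only).
    dirfd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)


# Usage in a throwaway directory; the file name is hypothetical.
with tempfile.TemporaryDirectory() as tmpdir:
    target = os.path.join(tmpdir, "config.bin")
    store_atomically(target, b"old contents")
    store_atomically(target, b"new contents")
    with open(target, "rb") as f:
        result = f.read()
    leftovers = os.listdir(tmpdir)

print(result, leftovers)
```

The cost is one or two synchronous flushes per replacement, which is the performance trade-off discussed above.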

You can read more about the ext4 problem here.

user answered Sep 19 '22 12:09