
How to make file creation an atomic operation?

I am using Python to write chunks of text to files in a single operation:

open(file, 'w').write(text) 

If the script is interrupted so a file write does not complete I want to have no file rather than a partially complete file. Can this be done?

hoju asked Feb 25 '10 12:02





2 Answers

Write the data to a temporary file and, once it has been written successfully, rename the file to the destination path, e.g.:

    import os

    f = open(tmpFile, 'w')
    f.write(text)
    # make sure that all data is on disk
    # see http://stackoverflow.com/questions/7433057/is-rename-without-fsync-safe
    f.flush()
    os.fsync(f.fileno())
    f.close()

    os.rename(tmpFile, myFile)

According to the documentation for os.rename (http://docs.python.org/library/os.html#os.rename):

If successful, the renaming will be an atomic operation (this is a POSIX requirement). On Windows, if dst already exists, OSError will be raised even if it is a file; there may be no way to implement an atomic rename when dst names an existing file

It also says:

The operation may fail on some Unix flavors if src and dst are on different filesystems.
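The Windows caveat quoted above applies to os.rename. As a minimal sketch that is not part of the original answer: on Python 3.3 and later, os.replace overwrites an existing destination on both POSIX and Windows. The file names below are hypothetical and exist only for the illustration.

```python
import os
import tempfile

# Hypothetical file names, used only for this illustration.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "data.txt.tmp")
dst = os.path.join(workdir, "data.txt")

with open(dst, "w") as f:
    f.write("old contents")
with open(src, "w") as f:
    f.write("new contents")

# os.replace (Python 3.3+) overwrites an existing dst on both POSIX and
# Windows; os.rename would raise OSError on Windows in this situation.
os.replace(src, dst)

with open(dst) as f:
    result = f.read()
src_still_exists = os.path.exists(src)
```

Like os.rename, os.replace can still fail if src and dst are on different filesystems, so the temporary file should live in the same directory as the destination.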

Note:

  • The rename may not be atomic if src and dst are not on the same filesystem.

  • The os.fsync step may be skipped if performance or responsiveness matters more than data integrity in cases such as a power failure or system crash.
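Both notes can be folded into one helper. The following is a sketch under stated assumptions, not the answerer's code: the name atomic_write and its fsync parameter are hypothetical, and it assumes Python 3.3+ for os.replace. It creates the temporary file in the destination's own directory so both paths share a filesystem, and it makes the os.fsync step optional.

```python
import os
import tempfile


def atomic_write(path, text, fsync=True):
    """Write text to path so that path ends up complete or untouched.

    atomic_write and its fsync flag are hypothetical names for this sketch.
    """
    # Create the temporary file next to the destination so that both are on
    # the same filesystem; otherwise the final rename may fail or not be atomic.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
            if fsync:
                # Flush Python's buffer, then ask the OS to write to disk.
                f.flush()
                os.fsync(f.fileno())
        # Assumption: Python 3.3+. os.replace also overwrites an existing
        # destination on Windows, which os.rename does not.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, including KeyboardInterrupt, drop the partial file.
        try:
            os.remove(tmp_path)
        except FileNotFoundError:
            pass
        raise


# Usage: the target either appears with the full text or not at all.
target = os.path.join(tempfile.mkdtemp(), "example.txt")
atomic_write(target, "hello")
```

One design consequence worth knowing: mkstemp creates the file readable and writable only by its owner, so the destination inherits those permissions unless they are changed before the rename.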

Anurag Uniyal answered Sep 21 '22 06:09



Here is a simple snippet that implements atomic writing using Python's tempfile module:

    with open_atomic('test.txt', 'w') as f:
        f.write("huzza")

or even reading and writing to and from the same file:

    with open('test.txt', 'r') as src:
        with open_atomic('test.txt', 'w') as dst:
            for line in src:
                dst.write(line)

using two simple context managers:

    import os
    import tempfile as tmp
    from contextlib import contextmanager

    @contextmanager
    def tempfile(suffix='', dir=None):
        """ Context for temporary file.

        Will find a free temporary filename upon entering
        and will try to delete the file on leaving, even in case of an exception.

        Parameters
        ----------
        suffix : string
            optional file suffix
        dir : string
            optional directory to save temporary file in
        """

        tf = tmp.NamedTemporaryFile(delete=False, suffix=suffix, dir=dir)
        tf.file.close()
        try:
            yield tf.name
        finally:
            try:
                os.remove(tf.name)
            except OSError as e:
                # errno 2 (ENOENT): the file is already gone, e.g. because
                # it was renamed to its destination
                if e.errno == 2:
                    pass
                else:
                    raise

    @contextmanager
    def open_atomic(filepath, *args, **kwargs):
        """ Open temporary file object that atomically moves to destination upon
        exiting.

        Allows reading and writing to and from the same filename.

        The file will not be moved to destination in case of an exception.

        Parameters
        ----------
        filepath : string
            the file path to be opened
        fsync : bool
            whether to force write the file to disk
        *args : mixed
            Any valid arguments for :code:`open`
        **kwargs : mixed
            Any valid keyword arguments for :code:`open`
        """
        # pop, not get: fsync must not be passed on to open()
        fsync = kwargs.pop('fsync', False)

        # create the temporary file in the destination directory so that
        # the rename stays on one filesystem
        with tempfile(dir=os.path.dirname(os.path.abspath(filepath))) as tmppath:
            with open(tmppath, *args, **kwargs) as file:
                try:
                    yield file
                finally:
                    if fsync:
                        file.flush()
                        os.fsync(file.fileno())
            os.rename(tmppath, filepath)
Nils Werner answered Sep 22 '22 06:09
