I know this is a somewhat theoretical question, but I haven't found a satisfactory answer yet, so I'm asking it here. I have multiple C++ processes (I'd also like to know how threads behave) that contend to replace the same file at the same time. How safe is this on Linux (I'm using Ubuntu 14.04 and CentOS 7)? Do I need to use locks?
Thanks in advance.
The filesystems of Unix-based OSes like Linux are designed around the notion of inodes, which are internal records holding a file's metadata (owner, permissions, size, location of its data blocks, and so on). Normally users and programs don't interact with inodes directly, but their presence gives these filesystems a level of indirection that lets them provide some useful semantics that other OSes (read: Windows) cannot.
filename --> inode --> data
In particular, when a file gets deleted, what's actually happening is the separation of the file's inode from its filename; not (necessarily) the deletion of the file's data itself. That is, the file and its contents can continue to exist (albeit invisibly, from the user's point of view) until all processes have closed their file-handles that were open on that file; once the inode is no longer accessible to any process, only then will the filesystem actually mark the file's data-blocks as free and available-for-reuse. In the meantime, the filename becomes available for another file's inode (and data) to be associated with, even though the old file's inode/data still technically exists.
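To make the last point concrete, here is a minimal sketch (not a definitive test) that keeps an old file open, unlinks its name, and then creates a new file under the same name. The function name reuse_name_while_open and the struct ReuseResult are made up for this illustration. Because the old inode is still held open, the new file should get a different inode number.

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

#include <string>

// Hypothetical helper type for this illustration only.
struct ReuseResult {
    bool ok;            // false if any system call failed
    ino_t old_inode;    // inode of the file that was unlinked while open
    ino_t new_inode;    // inode of the file created afterwards under the same name
    nlink_t old_links;  // link count of the old inode after the unlink
};

ReuseResult reuse_name_while_open(const std::string& path) {
    ReuseResult r{false, 0, 0, 0};
    struct stat st;

    // Create (or truncate) the "old" file and keep its descriptor open.
    int old_fd = ::open(path.c_str(), O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (old_fd < 0) return r;
    if (::fstat(old_fd, &st) != 0) { ::close(old_fd); return r; }
    r.old_inode = st.st_ino;

    // Detach the name from the old inode; the inode lives on via old_fd.
    if (::unlink(path.c_str()) != 0) { ::close(old_fd); return r; }

    // O_EXCL only succeeds if the name is free again.
    int new_fd = ::open(path.c_str(), O_RDWR | O_CREAT | O_EXCL, 0644);
    if (new_fd < 0) { ::close(old_fd); return r; }

    bool stat_ok = (::fstat(new_fd, &st) == 0);
    if (stat_ok) r.new_inode = st.st_ino;
    if (stat_ok && ::fstat(old_fd, &st) == 0) {
        r.old_links = st.st_nlink;
        r.ok = true;
    }

    ::close(new_fd);
    ::close(old_fd);      // last reference: the old inode can now be freed
    ::unlink(path.c_str());  // clean up the new file
    return r;
}
```

Calling reuse_name_while_open("/tmp/demo.txt") (the path is just an example) should report two different inode numbers; on common Linux filesystems the old inode's link count is typically 0 at that point.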
The upshot is that under Linux it's perfectly valid to delete (or rename) a file at any time, even if other threads/processes are in the middle of using it; your delete will succeed, and any other programs that have that file open at that instant can simply continue reading/writing/using it, exactly as if it hadn't been deleted. The only differences are that the filename will no longer appear in its directory, and that once the last open handle on the file has been closed (via fclose(), close(), etc.), the file's data will go away.
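As a small illustration of that behaviour, the sketch below writes a file, opens it, unlinks it, and then reads the contents back through the descriptor that is still open. The function name read_after_unlink is invented for this example, and error handling is kept minimal.

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

#include <string>

// Writes `text` to `path`, opens the file for reading, unlinks the name, and
// then reads the data back through the still-open descriptor.
// Returns the text that was read, or an empty string if any step failed.
std::string read_after_unlink(const std::string& path, const std::string& text) {
    int wfd = ::open(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (wfd < 0) return "";
    ssize_t written = ::write(wfd, text.data(), text.size());
    ::close(wfd);
    if (written != static_cast<ssize_t>(text.size())) return "";

    int rfd = ::open(path.c_str(), O_RDONLY);
    if (rfd < 0) return "";

    // Remove the directory entry while the descriptor is still open.
    if (::unlink(path.c_str()) != 0) { ::close(rfd); return ""; }

    // The name is gone from the directory...
    struct stat st;
    bool name_gone = (::stat(path.c_str(), &st) != 0);

    // ...but the data is still readable through the open descriptor.
    std::string out;
    char buf[256];
    ssize_t n;
    while ((n = ::read(rfd, buf, sizeof buf)) > 0) {
        out.append(buf, static_cast<size_t>(n));
    }
    ::close(rfd);  // last handle closed: the data blocks can now be freed
    return name_gone ? out : "";
}
```

For example, read_after_unlink("/tmp/demo.txt", "hello") should return "hello" even though /tmp/demo.txt no longer exists by the time the data is read.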
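Doing mv new.txt old.txt has much the same effect as doing rm old.txt ; mv new.txt old.txt, with one improvement: when both names are on the same filesystem, mv uses rename(), which swaps the name over to the new inode atomically, so there is never a moment when old.txt doesn't exist. There should therefore be no problems with doing this from multiple threads or processes without any locking; whichever rename happens last wins, and readers always see one complete file or another. The one thing to watch is that each writer should build its new file under its own unique temporary name, so that two writers never fill in the same new.txt at once. (Note that the slightly different situation of having multiple threads or processes open the same file and write into it at the same time is a bit more perilous; nothing will crash, but it would be easy for them to overwrite each other's data and corrupt the file if they aren't careful.)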
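One common way to do this from C++ is sketched below, under a few assumptions: each writer builds the new contents in its own temporary file next to the target (so the rename stays on one filesystem) and then renames it over the target. The names replace_file_atomically and read_whole_file are invented for this example; note also that mkstemp() creates the file with mode 0600, so a real program may want to adjust permissions before the rename.

```cpp
#include <cerrno>
#include <cstdio>      // std::rename
#include <fstream>
#include <sstream>
#include <stdlib.h>    // mkstemp (POSIX)
#include <string>
#include <unistd.h>
#include <vector>

// Replaces `target` with a file containing `contents`.
// Returns true on success; on failure the old target is left untouched.
bool replace_file_atomically(const std::string& target, const std::string& contents) {
    // Unique temp name in the same directory as the target.
    std::string tmpl = target + ".tmp.XXXXXX";
    std::vector<char> tmp(tmpl.begin(), tmpl.end());
    tmp.push_back('\0');

    int fd = ::mkstemp(tmp.data());  // fills in the XXXXXX and opens the file
    if (fd < 0) return false;

    const char* p = contents.data();
    size_t left = contents.size();
    bool ok = true;
    while (left > 0) {
        ssize_t n = ::write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, try again
            ok = false;
            break;
        }
        p += n;
        left -= static_cast<size_t>(n);
    }

    // Flush the data to disk before the new name becomes visible.
    if (ok && ::fsync(fd) != 0) ok = false;
    if (::close(fd) != 0) ok = false;

    // Atomically point the target name at the new inode.
    if (ok && std::rename(tmp.data(), target.c_str()) != 0) ok = false;

    if (!ok) ::unlink(tmp.data());  // don't leave a stray temp file behind
    return ok;
}

// Small helper used to check the result.
std::string read_whole_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::ostringstream ss;
    ss << in.rdbuf();
    return ss.str();
}
```

If several processes call replace_file_atomically() on the same target at once, each works on its own temporary file, and the target ends up with the complete contents from whichever rename ran last.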