 

Safe to have multiple processes writing to the same file at the same time? [CentOS 6, ext4]

I'm building a system where multiple slave processes are communicating via unix domain sockets, and they are writing to the same file at the same time. I have never studied filesystems or this specific filesystem (ext4), but it feels like there might be some danger here.

Each process writes to a disjoint subset of the output file (i.e., there is no overlap in the blocks being written). For example, P1 writes to only the first 50% of the file and P2 writes only to the second 50%. Or perhaps P1 writes only the odd-numbered blocks while P2 writes the even-numbered blocks.

Is it safe to have P1 and P2 (running simultaneously on separate threads) writing to the same file without using any locking? In other words, does the filesystem impose some kind of locking implicitly?

Note: I'm unfortunately not at liberty to output multiple files and join them later.

Note: My reading since posting this question does not agree with the only posted answer below. Everything I've read suggests that what I want to do is fine, whereas the respondent below insists it is unsafe; I am unable to discern the danger they describe.

asked Oct 20 '11 by Fixee

People also ask

What will happen if 2 processes read/write to the same file?

If you try to read at the same time someone else is writing, that's perfectly OK. The only issue is if you are trying to read a block that the writer is writing at the same time. In that case, the data you get is unpredictable, but the read itself should succeed.

Can multiple processes read the same file Linux?

Can multiple Java processes read the same file at the same time? Sure they can; ultimately, it is the role of the OS to ensure that each process or thread reads at its own pace, so you need not worry about it.

Can multiple processes open the same file?

During the actual reading and writing, yes. But multiple processes can open the same file at the same time and then write back. It's up to the processes themselves to ensure they don't do anything nasty. If you're writing the processes, look into flock() (file locking); see the sketch below.
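As a reference for the flock() suggestion above, a minimal sketch in C (the file name is illustrative). Note that flock() locks are advisory: they only protect you if every process cooperates by taking the lock.

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/file.h>

    int main(void)
    {
        int fd = open("shared.dat", O_WRONLY | O_CREAT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        /* Block until this process holds an exclusive advisory lock. */
        if (flock(fd, LOCK_EX) < 0) { perror("flock"); return 1; }

        /* ... safely write while the lock is held ... */
        if (write(fd, "hello\n", 6) != 6) perror("write");

        flock(fd, LOCK_UN);   /* released automatically on close(), too */
        close(fd);
        return 0;
    }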


1 Answer

What you're doing seems perfectly OK, provided you're using the POSIX "raw" IO syscalls such as read(), write(), lseek() and so forth.
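For concreteness, here is a minimal sketch of the questioner's pattern using pwrite(), which takes an explicit offset and does not move the file position; the file name and sizes are made up for illustration. Each instance opens its own file descriptor, so there is no shared file offset to race on.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <fcntl.h>
    #include <unistd.h>

    #define HALF (50L * 1024 * 1024)   /* each process owns 50 MiB */

    int main(int argc, char **argv)
    {
        /* argv[1] selects which half this process owns: 0 or 1 */
        long part = (argc > 1) ? atol(argv[1]) : 0;

        int fd = open("output.dat", O_WRONLY | O_CREAT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        char buf[4096];
        memset(buf, part ? 'B' : 'A', sizeof buf);

        /* Write only our own region; offsets never overlap the other
           process's region, and pwrite() ignores the file position. */
        for (off_t off = part * HALF; off < (part + 1) * HALF; off += sizeof buf) {
            if (pwrite(fd, buf, sizeof buf, off) != (ssize_t) sizeof buf) {
                perror("pwrite");
                return 1;
            }
        }

        close(fd);
        return 0;
    }

Run as, e.g., ./writer 0 & ./writer 1 & and the two instances fill disjoint halves of output.dat with no locking needed.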

If you use C stdio (fread(), fwrite(), and friends) or some other language runtime library that does its own userspace buffering, then the answer by "Tilo" is relevant: because the buffering is to some extent outside your control, the different processes might overwrite each other's data.
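If stdio cannot be avoided, one way to sidestep its userspace buffering is simply to disable it, so that each fwrite() maps directly onto a write() syscall. A minimal sketch, with the file name and offset chosen for illustration:

    #include <stdio.h>

    int main(void)
    {
        /* "r+b" assumes output.dat already exists at its full size. */
        FILE *f = fopen("output.dat", "r+b");
        if (!f) { perror("fopen"); return 1; }

        /* Unbuffered: each fwrite() now maps onto an immediate write(),
           so no stale buffered data can be flushed later at a bad time. */
        setvbuf(f, NULL, _IONBF, 0);

        fseek(f, 4096, SEEK_SET);    /* seek to this process's region */
        fwrite("data", 1, 4, f);

        fclose(f);
        return 0;
    }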

Regarding OS locking: POSIX states that writes and reads of size less than PIPE_BUF are atomic for certain special files (pipes and FIFOs), but there is no such guarantee for regular files. In practice, I think it's likely that I/O operations within a page are atomic, but there is no guarantee. The OS only does locking internally to the extent necessary to protect its own internal data structures.

One can use file locks, or some other interprocess communication mechanism, to serialize access to files. But all of this is relevant only if you have several processes doing I/O to the same region of a file. In your case, since your processes do I/O only to disjoint sections of the file, none of this matters, and you should be fine.
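For the file-lock option mentioned above, POSIX byte-range locks via fcntl() can serialize access to just a region rather than the whole file, which fits the "lock only where regions might overlap" case. A minimal sketch, with the path and byte range chosen for illustration:

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("output.dat", O_WRONLY | O_CREAT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        /* Exclusive advisory lock on bytes [0, 4096) only;
           F_SETLKW blocks until the range is available. */
        struct flock fl = {
            .l_type   = F_WRLCK,
            .l_whence = SEEK_SET,
            .l_start  = 0,
            .l_len    = 4096,
        };
        if (fcntl(fd, F_SETLKW, &fl) < 0) { perror("fcntl"); return 1; }

        /* ... write within the locked range ... */

        fl.l_type = F_UNLCK;          /* release just this range */
        fcntl(fd, F_SETLK, &fl);
        close(fd);
        return 0;
    }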

answered Sep 23 '22 by janneb