 

Is it safe to pipe the output of several parallel processes to one file using >>?

I'm scraping data from the web, and I have several processes of my scraper running in parallel.

I want the output of each of these processes to end up in the same file. As long as lines of text remain intact and don't get mixed up with each other, the order of the lines does not matter. In UNIX, can I just pipe the output of each process to the same file using the >> operator?

Asked Mar 14 '10 by conradlee


People also ask

Can multiple processes use the same file?

No, in general it is not safe. To guarantee that lines stay intact, each process needs to obtain an exclusive write lock before writing, which means every other process must wait while one process writes to the file. The more I/O-intensive processes you have, the longer the wait times become.
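One way to serialize the writers, as a minimal sketch, is with the `flock(1)` utility from util-linux (available on Linux; the file name `out.log` and the worker loop are illustrative assumptions):

```shell
#!/bin/sh
# Sketch: parallel workers append to one file, but each append happens
# only while holding an exclusive lock, so whole lines never interleave.
out=out.log
: > "$out"                        # start with an empty file

worker() {
  for i in 1 2 3; do
    # -x: take an exclusive lock on $out, run the command, then release
    flock -x "$out" -c "echo 'worker $1 line $i' >> $out"
  done
}

worker A & worker B &
wait                              # wait for both background workers
```

The trade-off is exactly the one described above: the lock serializes all writers, so heavy contention turns parallel I/O back into sequential I/O.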

Can multiple processes write to a pipe?

Yes, multiple processes can read from (or write to) a pipe.

Can multiple processes read from the same pipe?

If multiple processes simultaneously write to the same pipe, data from one process can be interleaved with data from another process, if modules are pushed on the pipe or the write is greater than PIPE_BUF. The order of data written is not necessarily the order of data read.
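The PIPE_BUF guarantee above can be sketched with a named pipe shared by two writers (a minimal illustration; the FIFO path and line format are assumptions). Because each `echo` issues one short `write()` well under PIPE_BUF, POSIX guarantees the lines are not interleaved:

```shell
#!/bin/sh
# Sketch: two parallel writers share one FIFO; one reader merges the output.
fifo=/tmp/demo.$$.fifo            # hypothetical temp path for the FIFO
mkfifo "$fifo"

cat "$fifo" > merged.txt &        # single reader collects everything

for tag in A B; do
  (
    exec 3> "$fifo"               # hold the write end open for the loop
    for i in $(seq 1 50); do
      echo "$tag:$i" >&3          # one short write() per line, <= PIPE_BUF
    done
  ) &                             # run both writers in parallel
done

wait                              # writers close the FIFO; cat then sees EOF
rm -f "$fifo"
```

Note that while no line is torn, the *order* of lines from A and B in `merged.txt` is arbitrary, matching the last sentence of the quote above.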


1 Answer

No. It is not guaranteed that lines will remain intact. They can become intermingled.

Searching further based on liori's answer, I found this in POSIX:

Write requests of {PIPE_BUF} bytes or less shall not be interleaved with data from other processes doing writes on the same pipe. Writes of greater than {PIPE_BUF} bytes may have data interleaved, on arbitrary boundaries, with writes by other processes, whether or not the O_NONBLOCK flag of the file status flags is set.

So lines longer than {PIPE_BUF} bytes are not guaranteed to remain intact.
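One caveat worth noting: the quoted guarantee is stated for *pipes*, while `>>` opens a regular file with O_APPEND. O_APPEND does guarantee the file offset is updated atomically with each `write()`, so no appended data is ever lost or overwritten; whether short lines also stay whole is not strictly promised by POSIX for regular files, though on local filesystems it holds in practice. A minimal stress-test sketch (file name and line format are assumptions):

```shell
#!/bin/sh
# Sketch: two parallel processes append short lines to the same file
# with >>. Each echo performs a single short write() under O_APPEND.
: > out.txt                       # start with an empty file

appender() {
  for i in $(seq 1 200); do
    echo "$1:$i:payload" >> out.txt
  done
}

appender A & appender B &
wait

# All 400 newlines must be present: O_APPEND never loses a write.
wc -l < out.txt
# Torn lines, if any, would fail this pattern:
#   grep -vE '^[AB]:[0-9]+:payload$' out.txt
```

For lines approaching or exceeding {PIPE_BUF} (commonly 4096 bytes on Linux, with a POSIX minimum of 512), the interleaving risk the answer describes becomes real, and an explicit lock is the safe choice.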

Answered Sep 22 '22 by Mark Byers