
simultaneous read on file descriptor from two threads

  1. My question: in Linux (and in FreeBSD, and in UNIX generally), is it possible/legal to read from a single file descriptor simultaneously from two threads?

  2. I searched but found nothing, although a lot of people ask a similar question about reading from and writing to a socket fd at the same time (meaning one thread reads while another writes, not two threads reading at once). I have also read some man pages and found no clear answer to my question.

  3. Why I ask. I tried to implement a simple program that counts lines on stdin, like wc -l. I was actually testing my home-made C++ I/O engine for overhead, and discovered that wc is 1.7 times faster. I trimmed down some of the C++ and came closer to wc's speed but didn't reach it. Then I experimented with the input buffer size and optimized it, but wc was still clearly a bit faster. Finally I created 2 threads that read the same STDIN_FILENO in parallel, and this at last was faster than wc! But the line count became incorrect, so I suppose the reads return some unexpected junk. Doesn't the kernel care which process is reading?

Edit: I did some research and discovered only that calling read directly via syscall does not change anything. The kernel code seems to do some synchronization, but I didn't understand much of it (read_write.c).

jarero asked Feb 20 '11 14:02


3 Answers

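When used with a descriptor (fd), read() and write() rely on the internal state of the fd to know the "current offset" at which the read or write will occur. As a result, threads that share a descriptor also share that offset, and each call moves it for all of them, so plain read() and write() are a poor fit for concurrent use of one fd.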

To allow a single descriptor to be used by multiple threads simultaneously, pread() and pwrite() are provided. With those interfaces, the descriptor and the desired offset are specified, so the "current offset" in the descriptor isn't used.
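To make the pread() idea concrete, here is a minimal sketch in Python, whose os.pread is a thin wrapper over the system call. It assumes the input is a regular file with a known size; the names parallel_line_count and count_newlines_range are made up for illustration and are not from the question.

```python
import os
import threading


def count_newlines_range(fd, start, end, results, slot, chunk=1 << 16):
    # Count newline bytes in [start, end) of fd. pread takes an explicit
    # offset, so the descriptor's shared "current offset" is never touched.
    count = 0
    pos = start
    while pos < end:
        data = os.pread(fd, min(chunk, end - pos), pos)
        if not data:  # hit EOF early, e.g. the file shrank
            break
        count += data.count(b"\n")
        pos += len(data)
    results[slot] = count  # each thread writes only its own slot


def parallel_line_count(path, workers=2):
    # Hypothetical helper: split the file into byte ranges, one per thread.
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        step = -(-size // workers)  # ceiling division
        results = [0] * workers
        threads = []
        for i in range(workers):
            start = min(i * step, size)
            end = min(start + step, size)
            t = threading.Thread(
                target=count_newlines_range,
                args=(fd, start, end, results, i),
            )
            threads.append(t)
            t.start()
        for t in threads:
            t.join()
        return sum(results)
    finally:
        os.close(fd)
```

Counting newline bytes works with arbitrary range boundaries because a line split across two ranges still contributes exactly one newline. Note that pread() needs a seekable descriptor: on a pipe or a terminal it fails with ESPIPE, so this does not help when stdin is piped in.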

Tom answered Nov 14 '22 22:11


For some file types the behavior is unspecified. POSIX says:

The read() function shall attempt to read nbyte bytes from the file associated with the open file descriptor, fildes, into the buffer pointed to by buf. The behavior of multiple concurrent reads on the same pipe, FIFO, or terminal device is unspecified.
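If you want to see what your own system does with concurrent reads on a pipe, a small experiment along these lines may help. It is only a sketch, in Python for brevity (os.read maps directly onto read()); the function names are invented here, and whatever it shows on one kernel is an observation, not something POSIX promises.

```python
import os
import threading


def drain(fd, chunks, size=4096):
    # Read fd until EOF, keeping every chunk this thread received
    # in a list that no other thread touches.
    while True:
        data = os.read(fd, size)
        if not data:
            break
        chunks.append(data)


def split_pipe_between_readers(payload, readers=2):
    # Hypothetical experiment: several threads read one pipe at once.
    r, w = os.pipe()
    per_thread = [[] for _ in range(readers)]
    threads = [
        threading.Thread(target=drain, args=(r, per_thread[i]))
        for i in range(readers)
    ]
    for t in threads:
        t.start()
    view = memoryview(payload)
    while len(view) > 0:  # os.write may accept only part of the data
        written = os.write(w, view)
        view = view[written:]
    os.close(w)  # closing the write end lets every reader see EOF
    for t in threads:
        t.join()
    os.close(r)
    return per_thread
```

Summing the chunk lengths (or the newline counts) over all threads and comparing with what was written tells you whether any bytes were lost or duplicated on that system. Even where the totals add up, no single reader sees a contiguous stream, so anything that spans a chunk boundary has to be handled by the program itself.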

jarero answered Nov 14 '22 23:11


About accessing a single file descriptor concurrently (i.e. from multiple threads or even processes), I'm going to cite POSIX.1-2008 (IEEE Std 1003.1-2008), Subsection 2.9.7 Thread Interactions with Regular File Operations:

2.9.7 Thread Interactions with Regular File Operations

All of the following functions shall be atomic with respect to each other in the effects specified in POSIX.1-2008 when they operate on regular files or symbolic links:

[…] read() […]

If two threads each call one of these functions, each call shall either see all of the specified effects of the other call, or none of them. […]

At first glance, this looks quite good. However, I hope you did not miss the restriction "when they operate on regular files or symbolic links".

@jarero cites:

The behavior of multiple concurrent reads on the same pipe, FIFO, or terminal device is unspecified.

So, implicitly, we're agreeing, I assume: it depends on the type of file you are reading. You said you read from stdin. Well, if your stdin is a plain file, you can use concurrent access. Otherwise you shouldn't.
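A program can find out at run time which case it is in by calling fstat() on the descriptor. A rough sketch in Python (os.fstat wraps the system call; the helper names describe_fd and concurrent_read_is_specified are my own, and the second one simply encodes the reading of POSIX given above):

```python
import os
import stat


def describe_fd(fd):
    # Classify an open descriptor by the file type bits from fstat().
    mode = os.fstat(fd).st_mode
    if stat.S_ISREG(mode):
        return "regular"
    if stat.S_ISFIFO(mode):
        return "fifo"  # pipes and named FIFOs
    if stat.S_ISCHR(mode):
        return "chardev"  # character devices, which includes terminals
    if stat.S_ISSOCK(mode):
        return "socket"
    return "other"


def concurrent_read_is_specified(fd):
    # Assumption: only regular files fall under the atomicity rule
    # quoted from section 2.9.7.
    return describe_fd(fd) == "regular"
```

Called on descriptor 0, this would typically report "regular" when the program is started as `prog < file` and "fifo" when it is started as `cat file | prog`, so the same binary can fall back to a single reader when stdin is not a plain file.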

hagello answered Nov 14 '22 23:11