
Will read() ever block after select()?

Tags: linux, sockets

I'm reading a stream of data through a TCP/IP socket. The stream load is very uneven: sometimes large bulks of data arrive every second, sometimes no data arrives for an hour. After a long period of inactivity (no data from the remote server, but the connection is still up) my program should take some actions.

I'm implementing the timeout using select(). It tells me whether data is ready, but I don't know exactly how much I can read without causing read() to block. Blocking is unacceptable, as it may last far longer than the timeout I need.

For the sake of efficiency, the stream is read into a large buffer, and read() is called with that buffer's size.

Will read() block after select() if the buffer to be filled is larger than the amount of data currently available in the socket?

Basilevs asked Mar 18 '11 12:03



3 Answers

Actually it should not block (that is what select() is for!), but in fact it might, in exceptional cases. Normally, read() returns whatever is available, up to the maximum number of bytes that you've specified. That can be zero bytes, which is a valid result (on a TCP socket it means the peer has closed the connection). It should never block after select() has reported readiness.

Nevertheless, see the Linux select man page:

Under Linux, select() may report a socket file descriptor as "ready for reading", while nevertheless a subsequent read blocks. This could for example happen when data has arrived but upon examination has wrong checksum and is discarded. There may be other circumstances in which a file descriptor is spuriously reported as ready. Thus it may be safer to use O_NONBLOCK on sockets that should not block.
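As a rough illustration of the normal case (not a proof about TCP), the sketch below uses an AF_UNIX socketpair() as a stand-in for the TCP connection. That substitution is an assumption made only so the example runs without a server, and short_read_demo() is a hypothetical helper name. It queues a few bytes, waits with select(), and then calls read() with a 4096-byte buffer.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: writes `len` bytes into one end of a socket pair,
 * waits for readiness on the other end with select(), then read()s into a
 * much larger buffer. Returns what read() returned, or -1 on any failure. */
ssize_t short_read_demo(const char *msg, size_t len)
{
    int sv[2];
    char buf[4096];                 /* far larger than the data in flight */
    ssize_t n = -1;

    /* Assumption: a local stream socket pair stands in for a TCP socket. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    if (write(sv[1], msg, len) == (ssize_t)len) {
        fd_set rfds;
        struct timeval tv = { .tv_sec = 1, .tv_usec = 0 };

        FD_ZERO(&rfds);
        FD_SET(sv[0], &rfds);

        /* select() reports readiness as soon as any data is queued. */
        if (select(sv[0] + 1, &rfds, NULL, NULL, &tv) == 1)
            n = read(sv[0], buf, sizeof buf);  /* returns what is queued */
    }

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

Calling short_read_demo("hello", 5) should return 5 rather than waiting for the buffer to fill. It does not exercise the spurious-readiness case quoted from the man page, which is the reason O_NONBLOCK remains the safer choice.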

Damon answered Oct 23 '22 05:10


There is O_NONBLOCK, which can be set on the socket with fcntl() and F_SETFL. With it set, read() returns immediately instead of blocking when no data is available.

ony answered Oct 23 '22 06:10


A blocking file descriptor will block on read() until there is something to read, whether that is one byte or your entire request. A non-blocking descriptor won't block on read() if there is nothing to read; the call fails immediately with EAGAIN instead. select() is not read(). It basically puts the process to sleep and monitors the file descriptor(s), including non-blocking descriptors. When there is activity on one of the descriptors (or the timeout period expires), select() returns and you can read your data, or do something else in the case of the timeout.

So you have two separate issues. (1) You want to "take some actions" when there is no data. That's the select() timeout. (2) Once you have data (notified by select()) you don't want to block on a read. That's the non-blocking mode. When you get EAGAIN on the non-blocking read, you loop back to select(); when select() times out, you "take some actions" and then loop back to select().

Duck answered Oct 23 '22 06:10