When does a read() on a TCP socket return

Question

Can someone please explain when exactly the read() function I use to get data from a TCP socket returns?

I use the code below to read from a measurement system that delivers data at a frequency of 15 Hz. READ_TIMEOUT_MS has a value of 200 and READ_BUFFER_SIZE has a value of 40000. Everything works fine, but read() returns 15 times a second with 1349 bytes read each time.

After reading Pitfall 5 in the following documentation, I would have expected the buffer to be filled completely:

http://www.ibm.com/developerworks/library/l-sockpit/

Init:

sock=socket(AF_INET, SOCK_STREAM, 0);
if (sock < 0)   // check the returned descriptor, not the socket() function itself
{
    goto fail0;
}

struct sockaddr_in server = {0};   // zero the unused fields before filling in the address
server.sin_addr.s_addr = inet_addr(IPAddress);
server.sin_family = AF_INET;
server.sin_port = htons(Port);
if (connect(sock,(struct sockaddr *)&server, sizeof(server)))
{
    goto fail1;
}

struct timeval tv;
tv.tv_sec = READ_TIMEOUT_MS / 1000;
tv.tv_usec = (READ_TIMEOUT_MS % 1000) * 1000;
if (setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, (char *)&tv, sizeof(struct timeval)))
{
    goto fail1;
}

return true;

fail1:
    close(sock);
    sock = -1;
fail0:
    return false;

Read:

unsigned char buf[READ_BUFFER_SIZE];
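// read() returns as soon as any data is available, not when buf is full
int len = read(sock, buf, sizeof(buf));
if (len <= 0)
{
    // 0: the peer closed the connection; < 0: error or receive timeout (see errno)
    return NULL;
}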

CBinaryDataStream* pData = new CBinaryDataStream(len);
pData->WriteToStream(buf, len);
return pData;

I hope this question is not a duplicate, because I searched for an answer before I asked. Please let me know if you need some further information.

Jens · Accepted Answer

I suspect that you are using Linux. The manpage for read says:

On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this number. It is not an error if this number is smaller than the number of bytes requested;

TCP sockets model a byte stream, not a block- or message-oriented protocol. Calling read on a socket returns as soon as any data is available in the socket's receive buffer, which lives in the kernel, not in the application. In principle, the data arrives at the network card and is transferred to kernel space, where the kernel's network stack processes it. Finally, the read syscall copies the data from kernel space to user space.

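When reading from a socket, you have to expect that an arbitrary number of bytes can be read. A call to read returns as soon as there is anything in the receive buffer or an error occurs. You cannot predict or assume how many bytes will be available. In your case, each call presumably returns as soon as one 1349-byte update from the measurement system has arrived, long before the 40000-byte buffer could fill up.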

In addition, the call can return without reading anything because the system call was interrupted by a signal; read then returns -1 with errno set to EINTR. This happens quite often when you debug or profile your application. You have to handle this in your application layer.

The complete receiver path is surprisingly complex when you want to have high data rates or low latency. The kernel and NICs implement many optimizations to e.g. spread load over cores, increase locality and offload processing to the NIC. Here are some additional links you may find interesting:

https://www.lmax.com/blog/staff-blogs/2016/05/06/navigating-linux-kernel-network-stack-receive-path/
https://blog.cloudflare.com/how-to-achieve-low-latency/
http://blog.packagecloud.io/eng/2016/06/22/monitoring-tuning-linux-networking-stack-receiving-data
http://syuu.dokukino.com/2013/05/linux-kernel-features-for-high-speed.html
