
What can I do to avoid TCP Zero Window / TCP Window Full on the receiver side?

I have a small application which sends files over the network to an agent located on a Windows OS.

When this application runs on Windows, everything works fine, the communication is OK and the files are all copied successfully.

But when this application runs on Linux (RedHat 5.3; the receiver is still Windows), I see TCP Zero Window and TCP Window Full messages in the Wireshark network trace every 1-2 seconds. The agent then closes the connection after a few minutes.

The Windows and Linux code is almost the same, and pretty simple. The only non-trivial operation is a setsockopt call with SO_SNDBUF and a value of 0xFFFF. Removing this code didn't help.
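
For reference, a minimal sketch of what that call presumably looks like (an assumption; _socket is the connected socket from the sending code below):

int sndBufSize = 0xFFFF;
// Assumption: this mirrors the SO_SNDBUF call described above.
if (setsockopt(_socket, SOL_SOCKET, SO_SNDBUF,
               (const char *)&sndBufSize, sizeof(sndBufSize)) != 0) {
    // check WSAGetLastError() on Windows / errno on Linux
}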

Can someone please help me with this issue?

EDIT: adding the sending code - it looks like it handles partial writes properly:

int totalSent = 0;
while (totalSent != dataLen)
{
    int bytesSent
        = ::send(_socket, (char *)(data + totalSent), dataLen - totalSent, 0);

    if (bytesSent == 0) {
        return totalSent;
    }
    else if (bytesSent == SOCKET_ERROR) {
#ifdef __WIN32
        int errcode = WSAGetLastError();
        if (errcode == WSAEWOULDBLOCK) {
#else
        if ((errno == EWOULDBLOCK) || (errno == EAGAIN)) {
#endif
            // the send would block: loop and try again
        }
        else {
            // real error: report SOCKET_ERROR if nothing was sent yet
            if (!totalSent) {
                totalSent = SOCKET_ERROR;
            }
            break;
        }
    }
    else {
        totalSent += bytesSent;
    }
}

Thanks in advance.

asked Aug 08 '10 by rkellerm

2 Answers

Not seeing your code, I'll have to guess.

The reason you get a Zero window in TCP is because there is no room in the receiver's recv buffer.

There are a number of ways this can occur. One common cause is sending over a LAN or other relatively fast network connection where one computer is significantly faster than the other. As an extreme example, say you've got a 3 GHz computer sending as fast as possible over Gigabit Ethernet to a machine running a 1 GHz CPU. Since the sender can send much faster than the receiver can read, the receiver's recv buffer fills up, causing the TCP stack to advertise a zero window to the sender.

Now this can cause problems on both the sending and receiving sides if they're not ready to deal with it. On the sending side it can cause the send buffer to fill up, making calls to send either block or, if you're using non-blocking I/O, fail. On the receiving side you could be spending so much time on I/O that the application never has a chance to process any of its data, giving the appearance of being locked up.

Edit

From some of your answers and code it sounds like your app is single-threaded and you're trying to do non-blocking sends for some reason. I assume you're setting the socket to non-blocking in some other part of the code.

Generally, I would say this is not a good idea. Ideally, if you're worried about your app hanging in send(2), set a long timeout on the socket using setsockopt and use a separate thread for the actual sending.

See socket(7):

SO_RCVTIMEO and SO_SNDTIMEO: Specify the receiving or sending timeouts until reporting an error. The parameter is a struct timeval. If an input or output function blocks for this period of time, and data has been sent or received, the return value of that function will be the amount of data transferred; if no data has been transferred and the timeout has been reached, then -1 is returned with errno set to EAGAIN or EWOULDBLOCK, just as if the socket was specified to be nonblocking. If the timeout is set to zero (the default), then the operation will never time out.
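
For example, a send timeout could be set along these lines (a minimal sketch; the 30-second value is arbitrary):

// Sketch: set a 30-second send timeout on socket s (value chosen arbitrarily).
struct timeval tv;
tv.tv_sec  = 30;
tv.tv_usec = 0;
if (setsockopt(s, SOL_SOCKET, SO_SNDTIMEO, (const char *)&tv, sizeof(tv)) != 0) {
    // check errno
}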

Your main thread can push each file descriptor into a queue, using, say, a Boost mutex for queue access, then start 1-N threads to do the actual sending using blocking I/O with send timeouts.
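
A minimal sketch of that layout, using std::mutex and std::condition_variable rather than Boost for brevity (an assumption; the queue holds descriptors of sockets that are ready to send on, and the sender threads call the doSend function shown below):

#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Work queue of socket descriptors, shared between the main thread
// and the sender threads; the mutex guards all queue access.
std::queue<int> workQueue;
std::mutex queueMutex;
std::condition_variable queueCv;

// Main thread: hand a descriptor to the sender pool.
void enqueue(int fd) {
    {
        std::lock_guard<std::mutex> lock(queueMutex);
        workQueue.push(fd);
    }
    queueCv.notify_one();
}

// Each of the 1-N sender threads: pop a descriptor and do the blocking I/O.
void senderThread() {
    for (;;) {
        int fd;
        {
            std::unique_lock<std::mutex> lock(queueMutex);
            queueCv.wait(lock, [] { return !workQueue.empty(); });
            fd = workQueue.front();
            workQueue.pop();
        }
        // ... look up the file data for fd and send it with doSend (below) ...
    }
}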

Your send function should look something like this (assuming you're setting a timeout):

// blocking send; a timeout is handled by the caller reading errno on a short send
int doSend(int s, const void *buf, size_t dataLen) {
    size_t totalSent = 0;

    while (totalSent != dataLen) {
        ssize_t bytesSent
            = send(s, (const char *)buf + totalSent, dataLen - totalSent, MSG_NOSIGNAL);

        if (bytesSent < 0) {
            if (errno == EINTR)
                continue;   // interrupted by a signal: retry the send
            break;          // timeout or real error: caller inspects errno
        }

        totalSent += bytesSent;
    }
    return (int)totalSent;
}

The MSG_NOSIGNAL flag ensures that your application isn't killed by SIGPIPE when writing to a socket that's been closed or reset by the peer. Sometimes I/O operations are interrupted by signals, and checking for EINTR allows you to restart the send.

Generally, you should call doSend in a loop with chunks of data that are of TCP_MAXSEG size.
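
For instance (a sketch; TCP_MAXSEG is queried with getsockopt on Linux, and the fallback value of 1460 is an assumption based on a typical Ethernet MSS):

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Query the MSS and push `data` through doSend in TCP_MAXSEG-sized chunks.
int sendChunked(int s, const char *data, size_t dataLen) {
    int mss = 0;
    socklen_t optLen = sizeof(mss);
    if (getsockopt(s, IPPROTO_TCP, TCP_MAXSEG, &mss, &optLen) != 0 || mss <= 0)
        mss = 1460;  // assumption: typical Ethernet MSS as a fallback

    size_t offset = 0;
    while (offset < dataLen) {
        size_t chunk = dataLen - offset;
        if (chunk > (size_t)mss)
            chunk = (size_t)mss;
        int sent = doSend(s, data + offset, chunk);
        if (sent < (int)chunk)
            return -1;  // short send: caller should inspect errno
        offset += chunk;
    }
    return (int)offset;
}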

On the receive side you can write a similar blocking recv function using a timeout in a separate thread.
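
Such a function might look like this (a sketch, assuming SO_RCVTIMEO has been set the same way as SO_SNDTIMEO above):

#include <cerrno>
#include <sys/socket.h>

// Blocking receive loop; on a timeout or error it returns what has been
// read so far, and the caller can inspect errno on a short read.
int doRecv(int s, void *buf, size_t dataLen) {
    size_t totalRead = 0;
    while (totalRead != dataLen) {
        ssize_t n = recv(s, (char *)buf + totalRead, dataLen - totalRead, 0);
        if (n == 0)
            break;          // peer closed the connection
        if (n < 0) {
            if (errno == EINTR)
                continue;   // interrupted by a signal: retry
            break;          // timeout (EAGAIN/EWOULDBLOCK) or real error
        }
        totalRead += n;
    }
    return (int)totalRead;
}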

answered Nov 15 '22 by Robert S. Barnes


A common mistake when developing with TCP sockets is making incorrect assumptions about read()/write() behavior.

When you perform a read or write operation you must check the return value; the call may not have read or written the requested number of bytes. You usually need a loop to keep track and make sure the entire data is transferred.
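
For example, a read loop that accounts for partial reads might look like this (a minimal sketch for a blocking POSIX socket):

#include <cerrno>
#include <unistd.h>

// Keep calling read() until len bytes have arrived or the peer closes.
ssize_t readAll(int fd, void *buf, size_t len) {
    size_t done = 0;
    while (done < len) {
        ssize_t n = read(fd, (char *)buf + done, len - done);
        if (n == 0)
            break;          // end of stream: peer closed
        if (n < 0) {
            if (errno == EINTR)
                continue;   // interrupted by a signal: retry
            return -1;      // real error
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}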

answered Nov 15 '22 by João Pinto