I have a small application which sends files over the network to an agent located on a Windows OS.
When this application runs on Windows, everything works fine, the communication is OK and the files are all copied successfully.
But when this application runs on Linux (RedHat 5.3, the receiver is still Windows), the Wireshark network trace shows TCP Zero Window and TCP Window Full messages every 1-2 seconds. The agent then closes the connection after a few minutes.
The Windows and Linux code is almost the same, and pretty simple. The only non-trivial operation is setsockopt with SO_SNDBUF and a value of 0xFFFF. Removing this code didn't help.
Can someone please help me with this issue?
EDIT: adding the sending code. It looks like it handles partial writes properly:
int totalSent=0;
while(totalSent != dataLen)
{
int bytesSent
= ::send(_socket,(char *)(data+totalSent), dataLen-totalSent, 0);
if (bytesSent ==0) {
return totalSent;
}
else if(bytesSent == SOCKET_ERROR){
#ifdef __WIN32
int errcode = WSAGetLastError();
if( errcode==WSAEWOULDBLOCK ){
#else
if ((errno == EWOULDBLOCK) || (errno == EAGAIN)) {
#endif
}
else{
if( !totalSent ) {
totalSent = SOCKET_ERROR;
}
break;
}
}
else{
totalSent+=bytesSent;
}
}
}
Thanks in advance.
Not seeing your code I'll have to guess.
The reason you get a Zero window in TCP is because there is no room in the receiver's recv buffer.
There are a number of ways this can occur. One common cause is sending over a LAN or other relatively fast network connection when one computer is significantly faster than the other. As an extreme example, say you've got a 3 GHz computer sending as fast as possible over Gigabit Ethernet to another machine that's running a 1 GHz CPU. Since the sender can send much faster than the receiver is able to read, the receiver's recv buffer will fill up, causing the TCP stack to advertise a Zero window to the sender.
Now this can cause problems on both the sending and receiving sides if they're not both ready to deal with it. On the sending side, the send buffer can fill up, and calls to send will either block or, if you're using non-blocking I/O, fail. On the receiving side, you could be spending so much time on I/O that the application has no chance to process any of its data, giving the appearance of being locked up.
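To see the sending-side effect in isolation, here is a small sketch. It assumes a POSIX system and uses a hypothetical helper name, fillUntilWouldBlock. It keeps sending on a non-blocking socket whose peer never reads, and reports how many bytes were accepted before send started failing with EAGAIN/EWOULDBLOCK. On a real TCP connection, that is the point at which you would see the Zero Window in the trace.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: switch fd to non-blocking mode and send until the
// buffers are full. Returns the number of bytes accepted before send()
// reported EAGAIN/EWOULDBLOCK, or -1 on any other failure.
long fillUntilWouldBlock(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    char chunk[4096] = {0};
    long queued = 0;
    for (;;) {
        ssize_t n = send(fd, chunk, sizeof(chunk), MSG_NOSIGNAL);
        if (n > 0) {
            queued += n;              // the kernel took (part of) the chunk
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;                 // interrupted by a signal: try again
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return queued;            // no room left: the peer is not reading
        return -1;                    // unexpected error
    }
}
```

The number returned depends on the buffer sizes on both ends, so treat it as an illustration rather than something to rely on.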
Edit
From some of your answers and code it sounds like your app is single threaded and you're trying to do non-blocking sends for some reason. I assume you're setting the socket to non-blocking in some other part of the code.
Generally, I would say that this is not a good idea. Ideally, if you're worried about your app hanging on a send(2) you should set a long timeout on the socket using setsockopt and use a separate thread for the actual sending.
See socket(7):
SO_RCVTIMEO and SO_SNDTIMEO Specify the receiving or sending timeouts until reporting an error. The parameter is a struct timeval. If an input or output function blocks for this period of time, and data has been sent or received, the return value of that function will be the amount of data transferred; if no data has been transferred and the timeout has been reached then -1 is returned with errno set to EAGAIN or EWOULDBLOCK just as if the socket was specified to be nonblocking. If the timeout is set to zero (the default) then the operation will never timeout.
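As a minimal sketch of what that looks like on Linux (the helper name setSendTimeout and the whole-seconds parameter are my own, not part of any API):

```cpp
#include <sys/socket.h>
#include <sys/time.h>

// Hypothetical helper: set a send timeout, in whole seconds, on an
// existing socket. Returns 0 on success or -1 on failure, with errno
// set by setsockopt.
int setSendTimeout(int fd, long seconds) {
    struct timeval tv;
    tv.tv_sec = seconds;
    tv.tv_usec = 0;
    return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
}
```

For example, setSendTimeout(s, 30) makes a blocked send give up after roughly 30 seconds. SO_RCVTIMEO is set the same way for the receiving direction.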
Your main thread can push each file descriptor into a queue, using say a boost mutex for queue access, then start 1 - N threads to do the actual sending using blocking I/O with send timeouts.
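A rough sketch of that arrangement follows. I'm using the C++11 standard library (std::mutex, std::condition_variable, std::thread) in place of Boost, and the names FdQueue and startSenders are made up for the example. The sendFile callback stands for whatever routine actually does the blocking send.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical work queue of file descriptors shared by the sender threads.
class FdQueue {
public:
    void push(int fd) {
        { std::lock_guard<std::mutex> lock(m_); q_.push(fd); }
        cv_.notify_one();
    }
    // No more work will be added; wakes every waiting worker.
    void close() {
        { std::lock_guard<std::mutex> lock(m_); closed_ = true; }
        cv_.notify_all();
    }
    // Blocks until a descriptor is available. Returns false once the
    // queue has been closed and fully drained.
    bool pop(int &fd) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return closed_ || !q_.empty(); });
        if (q_.empty()) return false;
        fd = q_.front();
        q_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<int> q_;
    bool closed_ = false;
};

// Start n workers; each one takes descriptors off the queue and passes
// them to sendFile (a caller-supplied function that does the blocking send).
template <typename SendFn>
std::vector<std::thread> startSenders(FdQueue &queue, std::size_t n, SendFn sendFile) {
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < n; ++i)
        workers.emplace_back([&queue, sendFile]() mutable {
            int fd;
            while (queue.pop(fd)) sendFile(fd);
        });
    return workers;
}
```

The main thread would push descriptors, call close() when there is nothing more to send, and then join() each returned thread.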
Your send function should look something like this ( assuming you're setting a timeout ):
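// blocking send, timeout is handled by caller reading errno on short send
// needs <sys/types.h>, <sys/socket.h> and <errno.h>
ssize_t doSend(int s, const void *buf, size_t dataLen) {
    const char *data = (const char *)buf;
    size_t totalSent = 0;
    while (totalSent != dataLen)
    {
        ssize_t bytesSent
            = send(s, data + totalSent, dataLen - totalSent, MSG_NOSIGNAL);
        if (bytesSent < 0) {
            if (errno == EINTR)
                continue;   // interrupted by a signal: retry the send
            break;          // timeout (EAGAIN/EWOULDBLOCK) or a real error
        }
        totalSent += bytesSent;
    }
    return totalSent;
}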
The MSG_NOSIGNAL flag ensures that your application isn't killed by a SIGPIPE when writing to a socket that's been closed or reset by the peer. Sometimes I/O operations are interrupted by signals, and checking for EINTR allows you to restart the send.
Generally, you should call doSend in a loop with chunks of data that are of TCP_MAXSEG size.
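Something along these lines, as a sketch only. getMaxSegment reads the current value with getsockopt, and sendInChunks walks the buffer in pieces of at most chunkSize bytes. Both names are hypothetical, and sendAll stands for a function with the same shape as doSend (socket, pointer, length, returning the bytes sent).

```cpp
#include <cstddef>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>

// Ask the TCP stack for the socket's current maximum segment size.
// Returns -1 if the option cannot be read (for example, not a TCP socket).
int getMaxSegment(int s) {
    int mss = 0;
    socklen_t len = sizeof(mss);
    if (getsockopt(s, IPPROTO_TCP, TCP_MAXSEG, &mss, &len) < 0)
        return -1;
    return mss;
}

// Hand the buffer to sendAll in pieces of at most chunkSize bytes.
// Stops at the first short or failed send and returns the bytes sent so far.
template <typename SendAll>
std::size_t sendInChunks(int s, const char *data, std::size_t len,
                         std::size_t chunkSize, SendAll sendAll) {
    std::size_t done = 0;
    while (done < len && chunkSize > 0) {
        std::size_t want = (len - done < chunkSize) ? len - done : chunkSize;
        auto r = sendAll(s, data + done, want);
        if (r <= 0)
            break;                               // nothing sent: give up
        done += static_cast<std::size_t>(r);
        if (static_cast<std::size_t>(r) != want)
            break;                               // short send: let the caller check errno
    }
    return done;
}
```

A caller would typically read the segment size once after connecting and reuse it for every buffer.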
On the receive side you can write a similar blocking recv function using a timeout in a separate thread.
A common mistake when developing with TCP sockets is making incorrect assumptions about read()/write() behavior.
When you perform a read/write operation you must check the return value: the call may not have read or written the requested number of bytes. You usually need a loop to keep track and make sure all of the data was transferred.