Another socket problem.
In my client code, I am sending some packet and expectign some response from the server side:
send()
Immediately after send(), the server crashes and rebooted itself. In the meantime the recv() is waiting. But even after the server is up, the receive call is hanging. I have added SIGPIPE signal handling but its still not able to recognize that the socket is broken.
When i cancel the operation, i got the error from recv() that interrupt has been issued.
Anyone could help me how to rectify this error?
This is in a shared library running on Solaris machine.
recv(IPC, Buffer, int n) is a blocking call, that is, if data is available it writes it to the buffer and immediately returns true, and if no data is available it waits for at least n seconds to receive any data.
The recv() call blocks until data arrives on the socket. While it is blocked, other threads that are waiting for the lock are also blocked. Although this example is specific to network I/O, the recv() call could be replaced with any blocking call, and the same behavior would occur.
Return value If no error occurs, recv returns the number of bytes received and the buffer pointed to by the buf parameter will contain this data received. If the connection has been gracefully closed, the return value is zero.
For windows, you can use ioctlsocket() to set the socket in non-blocking mode.
May be you should set a timeout delay in order to manage this case. It can easily done by using setsockopt and setting SO_RECVTIMEO flag on your socket:
struct timeval tv;
tv.tv_sec = 30;
tv.tv_usec = 0;
if (setsockopt(socket_fd, SOL_SOCKET, SO_RCVTIMEO, (char *)&tv, sizeof tv))
{
perror("setsockopt");
return -1;
}
Another possibility is to use non blocking sockets and manage read/write stuff with poll(2) or select(2). You should take a look on Beej's Guide to Network Programming.
As others have mentioned, you can use select() to set a time limit for the socket to become readable.
By default, the socket will become readable when there's one or more bytes available in the socket receive buffer. I say "by default" because this amount is tunable by setting the socket receive buffer "low water mark" using the SO_RCVLOWAT socket option.
Below is a function you can use to determine if the socket is ready to be read within a specified time limit. It will return 1 if the socket has data available for reading. Otherwise, it will return 0 if it times out.
The code is based on an example from the book Unix Network Programming (www.unpbook.com) that can provide you with more information.
/* Wait for "timeout" seconds for the socket to become readable */
readable_timeout(int sock, int timeout)
{
struct timeval tv;
fd_set rset;
int isready;
FD_ZERO(&rset);
FD_SET(sock, &rset);
tv.tv_sec = timeout;
tv.tv_usec = 0;
again:
isready = select(sock+1, &rset, NULL, NULL, &tv);
if (isready < 0) {
if (errno == EINTR) goto again;
perror("select"); _exit(1);
}
return isready;
}
Use it like this:
if (readable_timeout(sock, 5/*timeout*/)) {
recv(sock, ...)
You mention handling SIGPIPE on the client side which is separate issue. If you are getting this is means your client is writing to the socket, even after having received a RST from the server. That is a separate issue from having a problem with a blocking call to recv().
The way that could arise is that the server crashes and reboots, losing its TCP state. Your client sends data to the server which sends back a RST, since it no longer has state for the connection. Your client ignores the RST and tries to send more data and it's this second send() which causes your program to receive the SIGPIPE signal.
What error were you getting from the call to recv()?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With