I have a multi-threaded server (thread pool) that handles a large number of requests (up to 500/sec per node) using 20 threads. A listener thread accepts incoming connections and queues them for the handler threads to process. Once a response is ready, a handler thread writes it out to the client and closes the socket. All seemed fine until recently, when a test client program started hanging randomly after reading the response. After a lot of digging, it seems that the close() on the server is not actually disconnecting the socket. I've added some debug prints with the file descriptor number and I get this kind of output:
Processing request for 21
Writing to 21
Closing 21
The return value of close() is 0; otherwise another debug statement would have been printed. After this output, for a client that hangs, lsof shows an established connection:
SERVER 8160 root 21u IPv4 32754237 TCP localhost:9980->localhost:47530 (ESTABLISHED)
CLIENT 17747 root 12u IPv4 32754228 TCP localhost:47530->localhost:9980 (ESTABLISHED)
It's as if the server never sends the shutdown sequence to the client. This state persists until the client is killed, leaving the server side in a CLOSE_WAIT state:
SERVER 8160 root 21u IPv4 32754237 TCP localhost:9980->localhost:47530 (CLOSE_WAIT)
Also, if the client has a timeout specified, it will time out instead of hanging. I can also manually run
call close(21)
in the server from gdb, and the client will then disconnect. This happens maybe once in 50,000 requests, but might not happen for extended periods.
Linux version: 2.6.21.7-2.fc8xen; CentOS version: 5.4 (Final)
The socket actions are as follows.
SERVER:
int client_socket;
struct sockaddr_in client_addr;
socklen_t client_len = sizeof(client_addr);

while (true) {
    client_socket = accept(incoming_socket, (struct sockaddr *)&client_addr, &client_len);
    if (client_socket == -1)
        continue;
    /* insert into queue here for threads to process */
}
Then the thread picks up the socket and builds the response.
/* get client_socket from queue */

/* processing request here */

/* now set to blocking for write; was previously set to non-blocking for reading */
int flags = fcntl(client_socket, F_GETFL);
if (flags < 0)
    abort();
if (fcntl(client_socket, F_SETFL, flags & ~O_NONBLOCK) < 0)   /* clear O_NONBLOCK to restore blocking mode */
    abort();

server_write(client_socket, response_buf, response_length);
server_close(client_socket);
Here are server_write and server_close:
void server_write(int fd, char const *buf, ssize_t len)
{
    printf("Writing to %d\n", fd);
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n <= 0)
            return;   /* I don't really care what error happened, we'll just drop the connection */
        len -= n;
        buf += n;
    }
}

void server_close(int fd)
{
    for (uint32_t i = 0; i < 10; i++) {
        int n = close(fd);
        if (!n)   /* closed successfully */
            return;
        usleep(100);
    }
    printf("Close failed for %d\n", fd);
}
CLIENT:
The client side uses libcurl v7.27.0:
CURL *curl = curl_easy_init();
CURLcode res;
curl_easy_setopt(curl, CURLOPT_URL, url);
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_callback);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, write_tag);
res = curl_easy_perform(curl);
Nothing fancy, just a basic curl connection. The client hangs in transfer.c (in libcurl) because the socket is not perceived as being closed; it's waiting for more data from the server.
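As a stopgap while debugging, the client can be given explicit timeouts so it errors out instead of hanging. A minimal sketch extending the snippet above, using standard libcurl options (the timeout values are arbitrary examples):

/* Sketch: bound the connect phase and the whole transfer so a wedged
 * connection errors out instead of hanging; values are arbitrary. */
curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, 5L);   /* connect phase, seconds */
curl_easy_setopt(curl, CURLOPT_TIMEOUT, 10L);         /* whole transfer, seconds */
res = curl_easy_perform(curl);
if (res == CURLE_OPERATION_TIMEDOUT)
    fprintf(stderr, "timed out: %s\n", curl_easy_strerror(res));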
Things I've tried so far:
Shutdown before close
shutdown(fd, SHUT_WR);
char buf[64];
while (read(fd, buf, 64) > 0)
    ;
/* then close */
Setting SO_LINGER to close forcibly in 1 second
struct linger l;
l.l_onoff = 1;
l.l_linger = 1;
if (setsockopt(client_socket, SOL_SOCKET, SO_LINGER, &l, sizeof(l)) == -1)
    abort();
These have made no difference. Any ideas would be greatly appreciated.
EDIT -- This ended up being a thread-safety issue inside a queue library, which allowed the same socket to be handled by multiple threads.
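For context, the fix amounts to making the producer/consumer handoff atomic, so each accepted fd leaves the queue exactly once. A minimal sketch of a thread-safe fd queue with pthreads (the names and fixed capacity are hypothetical, not the library that was actually at fault):

#include <pthread.h>
#include <unistd.h>

#define QUEUE_CAP 1024

static int q_buf[QUEUE_CAP];
static size_t q_head, q_tail, q_count;
static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t q_nonempty = PTHREAD_COND_INITIALIZER;

void queue_push(int fd)                /* called by the listener thread */
{
    pthread_mutex_lock(&q_lock);
    if (q_count < QUEUE_CAP) {
        q_buf[q_tail] = fd;
        q_tail = (q_tail + 1) % QUEUE_CAP;
        q_count++;
        pthread_cond_signal(&q_nonempty);
    } else {
        close(fd);                     /* queue full: drop the connection */
    }
    pthread_mutex_unlock(&q_lock);
}

int queue_pop(void)                    /* called by handler threads */
{
    pthread_mutex_lock(&q_lock);
    while (q_count == 0)
        pthread_cond_wait(&q_nonempty, &q_lock);
    int fd = q_buf[q_head];            /* each fd is handed to exactly one thread */
    q_head = (q_head + 1) % QUEUE_CAP;
    q_count--;
    pthread_mutex_unlock(&q_lock);
    return fd;
}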
The close() call shuts down the socket associated with the descriptor and frees the resources allocated to it. If the descriptor refers to an open TCP connection, the connection is closed. If a stream socket is closed while there is input data queued, the TCP connection is reset rather than being cleanly closed.
One way or another, if you don't close a socket, your program will leak a file descriptor. Programs can usually only open a limited number of file descriptors, so if this happens often it may become a problem.
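To see how close you are to that limit at runtime, you can query it with getrlimit(2). A minimal sketch (the helper name and reporting format are my own):

#include <stdio.h>
#include <sys/resource.h>

static void printFdLimit(void)   /* hypothetical helper, for illustration */
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) == 0)
        printf("fd limit: soft=%lu hard=%lu\n",
               (unsigned long)rl.rlim_cur, (unsigned long)rl.rlim_max);
}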
Strictly speaking, you're supposed to use shutdown on a socket before you close it. The shutdown is an advisory to the socket at the other end. Depending on the argument you pass it, it can mean “I'm not going to send any more, but I'll still listen”, or “I'm not listening, good riddance!”.
Here is some code I've used on many Unix-like systems (e.g. SunOS 4, SGI IRIX, HPUX 10.20, CentOS 5, Cygwin) to close a socket:
int getSO_ERROR(int fd)
{
    int err = 1;
    socklen_t len = sizeof err;
    if (-1 == getsockopt(fd, SOL_SOCKET, SO_ERROR, (char *)&err, &len))
        FatalError("getSO_ERROR");
    if (err)
        errno = err;   // set errno to the socket SO_ERROR
    return err;
}

void closeSocket(int fd)   // *not* the Windows closesocket()
{
    if (fd >= 0) {
        getSO_ERROR(fd);   // first clear any errors, which can cause close to fail
        if (shutdown(fd, SHUT_RDWR) < 0)   // secondly, terminate the 'reliable' delivery
            if (errno != ENOTCONN && errno != EINVAL)   // SGI causes EINVAL
                Perror("shutdown");
        if (close(fd) < 0)   // finally call close()
            Perror("close");
    }
}
But the above does not guarantee that any buffered writes are sent.
Graceful close: It took me about 10 years to figure out how to close a socket. But for another 10 years I just lazily called usleep(20000) for a slight delay to 'ensure' that the write buffer was flushed before the close. This obviously is not very clever, because usleep() can return early if interrupted by a signal (but I usually called usleep() twice to handle this case--a hack). But doing a proper flush is surprisingly hard. Using SO_LINGER is apparently not the way to go. And SIOCOUTQ appears to be Linux-specific.

Note that shutdown(fd, SHUT_WR) doesn't stop writing, contrary to its name, and maybe contrary to man 2 shutdown.
The function flushSocketBeforeClose() below waits until a read of zero bytes, or until the timer expires. The function haveInput() is a simple wrapper for select(2), and is set to block for up to 1/100th of a second.
bool haveInput(int fd, double timeout)
{
    int status;
    fd_set fds;
    struct timeval tv;
    FD_ZERO(&fds);
    FD_SET(fd, &fds);
    tv.tv_sec  = (long)timeout;                             // cast needed for C++
    tv.tv_usec = (long)((timeout - tv.tv_sec) * 1000000);   // 'suseconds_t'

    while (1) {
        if (!(status = select(fd + 1, &fds, 0, 0, &tv)))
            return FALSE;
        else if (status > 0 && FD_ISSET(fd, &fds))
            return TRUE;
        else if (status > 0)
            FatalError("I am confused");
        else if (errno != EINTR)
            FatalError("select");   // tbd EBADF: man page "an error has occurred"
    }
}

bool flushSocketBeforeClose(int fd, double timeout)
{
    const double start = getWallTimeEpoch();
    char discard[99];
    ASSERT(SHUT_WR == 1);
    if (shutdown(fd, 1) != -1)
        while (getWallTimeEpoch() < start + timeout)
            while (haveInput(fd, 0.01))                  // can block for 0.01 secs
                if (!read(fd, discard, sizeof discard))
                    return TRUE;                         // success!
    return FALSE;
}
Example of use:
if (!flushSocketBeforeClose(fd, 2.0))   // can block for 2s
    printf("Warning: Cannot gracefully close socket\n");
closeSocket(fd);
In the above, my getWallTimeEpoch() is similar to time(), and Perror() is a wrapper for perror().
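For completeness, here is a guess at minimal implementations of the helpers the snippets assume; the originals are not shown, so these are illustrative only:

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

static void Perror(const char *msg)        /* perror() wrapper that keeps going */
{
    perror(msg);
}

static void FatalError(const char *msg)    /* report and abort */
{
    perror(msg);
    exit(EXIT_FAILURE);
}

static double getWallTimeEpoch(void)       /* like time(), but with fractional seconds */
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}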
Edit: Some comments:
My first admission is a bit embarrassing. The OP and Nemo challenged the need to clear the internal so_error before close, but I cannot now find any reference for this. The system in question was HPUX 10.20. After a failed connect(), just calling close() did not release the file descriptor, because the system wished to deliver an outstanding error to me. But I, like most people, never bothered to check the return value of close. So I eventually ran out of file descriptors (ulimit -n), which finally got my attention.
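A cheap habit that makes this class of leak visible is to never call close() bare. A sketch of a checking wrapper (the name is mine):

#include <stdio.h>
#include <unistd.h>

static void closeOrComplain(int fd)   /* hypothetical wrapper, for illustration */
{
    if (close(fd) < 0)
        perror("close");   /* e.g. a deferred error the system wanted to deliver */
}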
(Very minor point.) One commentator objected to the hard-coded numerical arguments to shutdown(), rather than e.g. SHUT_WR for 1. The simplest answer is that Windows uses different #defines/enums, e.g. SD_SEND. And many other writers (e.g. Beej) use the plain numbers, as do many legacy systems.
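If you prefer the symbolic names but also need to build on Windows, a small compatibility shim is a common workaround; a sketch (SD_* are the Winsock equivalents of the POSIX SHUT_* names, with the same values):

#if defined(_WIN32)
  #include <winsock2.h>
  #ifndef SHUT_RD
    #define SHUT_RD   SD_RECEIVE   /* 0 */
    #define SHUT_WR   SD_SEND      /* 1 */
    #define SHUT_RDWR SD_BOTH      /* 2 */
  #endif
#else
  #include <sys/socket.h>
#endif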
Also, I always, always, set FD_CLOEXEC on all my sockets, since in my applications I never want them passed to a child and, more importantly, I don't want a hung child to impact me.
Sample code to set CLOEXEC:
static void setFD_CLOEXEC(int fd)
{
    int status = fcntl(fd, F_GETFD, 0);
    if (status >= 0)
        status = fcntl(fd, F_SETFD, status | FD_CLOEXEC);
    if (status < 0)
        Perror("Error getting/setting socket FD_CLOEXEC flags");
}