
Why does my Perl TCP server script hang with many TCP connections?

I've got a strange issue with a server accepting TCP connections. Even though some processes are normally idle and waiting, the server hangs once the volume of connections reaches a certain level.

Long version:

The server is written in Perl and binds a $srv socket with the reuse flag and a listen backlog of 5. It then forks into 10 processes, each running a loop of $clt=$srv->accept(); do_processing($clt); $clt->shutdown(2);

The client, written in C, is also very simple: it sends some lines, receives all available lines, and then calls shutdown(sockfd, 2). Nothing is asynchronous, and at the end both the send and receive queues are empty (as reported by netstat).

Connections last only ~20ms. All clients behave the same way and use the same implementation. Now let's say I'm accepting X connections from client 1 and another X from client 2. The processes still report that they're idle all the time. If I add another X connections from client 3, the server processes suddenly start hanging just after accepting. The first blocking thing they do after accept() is while (<$clt>) ... but they get no data, even on the first read. Soon all 10 processes are in this state and never stop waiting. Under strace, the server processes appear to hang in read(), which makes sense.

There are loads of connections in TIME_WAIT state belonging to that server (~100 when the problem starts to manifest), but this might be a red herring.

What could be happening here?


After some more analysis: it turned out that the client was at fault. It was not closing previous connections properly before trying the next one, so the servers at the beginning of the load-balancing list were left with stale connections.

viraptor asked Apr 19 '10 17:04



1 Answer

This probably isn't the solution to your problem, but it might solve a problem you'll experience in the future: don't forget to close() the sockets when you're done! shutdown() will disconnect the stream, but it'll still eat a file descriptor.
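To illustrate the difference between shutdown() and close(), here is a minimal sketch. It uses a socketpair() as a stand-in for an accepted client connection, which is an assumption made purely so the example runs without a network; the variable names are hypothetical and not taken from the original server.

```perl
use strict;
use warnings;
use Socket;

# A connected pair of stream sockets stands in for an accepted client
# connection (assumption: the real server would get $clt from accept()).
socketpair(my $clt, my $peer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

my $fd = fileno($clt);

# shutdown(..., 2) ends the conversation in both directions...
shutdown($clt, 2) or die "shutdown: $!";

# ...but the handle still owns its file descriptor afterwards.
my $fd_after_shutdown = fileno($clt);

# close() is what actually hands the descriptor back to the process.
close($clt) or die "close: $!";

# fileno() returns undef for a handle that is no longer open.
my $fd_after_close = fileno($clt);

close($peer);

print "descriptor still held after shutdown\n" if defined $fd_after_shutdown;
print "descriptor released after close\n"      if !defined $fd_after_close;
```

Note that Perl also closes a handle automatically once the last reference to it goes away, so a simple loop that reassigns $clt may hide the leak. An explicit close() makes the descriptor's lifetime obvious and does not depend on that.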

Since you said strace shows the processes stuck in read(), your problem seems to be that the client isn't sending the data you expect. You should either fix your client or add an alarm() to your server processes so that they can survive dead clients.
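The alarm() suggestion could look roughly like the sketch below, which follows the usual eval/die idiom from the Perl documentation. The helper name read_with_timeout and the one-second timeout are hypothetical choices for illustration, and the socketpair() with a silent peer only simulates a dead client; in the real server $clt would come from accept().

```perl
use strict;
use warnings;
use Socket;

# Hypothetical helper: read lines from a client until EOF, but give up
# if the whole read takes longer than $timeout seconds.
# Returns (\@lines, $timed_out).
sub read_with_timeout {
    my ($clt, $timeout) = @_;
    my @lines;
    my $timed_out = 0;
    eval {
        # The handler dies, which aborts the blocking read below.
        local $SIG{ALRM} = sub { die "alarm\n" };
        alarm($timeout);
        while (my $line = <$clt>) {
            push @lines, $line;
        }
        alarm(0);    # finished in time, cancel the pending alarm
        1;
    } or do {
        alarm(0);
        die $@ unless $@ eq "alarm\n";    # re-throw unrelated errors
        $timed_out = 1;
    };
    return (\@lines, $timed_out);
}

# Simulate a dead client: the peer end stays open but never sends anything.
socketpair(my $clt, my $silent_peer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

my ($lines, $timed_out) = read_with_timeout($clt, 1);
print $timed_out
    ? "client timed out\n"
    : "got " . scalar(@$lines) . " line(s)\n";

close($clt);
close($silent_peer);
```

With a silent peer, the call should give up after about one second instead of blocking forever. The server process can then close the connection and go back to accept().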

apenwarr answered Oct 25 '22 11:10
