We are experiencing a problem where incoming client connections to our socket server are refused when a relatively small number of nodes (16 to 24, though we will need to handle more in the future) try to connect simultaneously.
Some specifics:
When we do a test run on the grid, each client node attempts to connect to the server, sends a 40-100K packet, and then drops the connection. With between 16 and 24 nodes we start seeing client connections fail to reach the server. In other words, we are trying to handle at most 16-24 simultaneous client connections and failing, which does not seem right to us at all.
The main server loop listens on a regular Java ServerSocket; when it accepts a connection it spawns a new Thread to handle it and returns immediately to listening on the socket. We also have a dummy Python server that simply reads and discards the incoming data, and a C++ server that logs the data before discarding it. Both exhibit the same problem of clients being unable to connect, with minor variations in how many connections succeed before the failures start. This has led us to believe that no specific server is at fault and that the cause is probably environmental.
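For reference, the dummy Python server follows the same thread-per-connection pattern; a minimal sketch along these lines (the port, buffer size, and function names are illustrative, not our actual code):

```python
import socket
import threading

def handle(conn):
    """Read and discard everything the client sends, then close."""
    with conn:
        while conn.recv(8192):  # recv() returns b"" once the client drops the connection
            pass

def serve(host="127.0.0.1", port=0, backlog=50):
    """Create the listening socket; port=0 picks an ephemeral port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, port))
    server.listen(backlog)  # backlog: queue of completed but not-yet-accepted connections
    return server

def accept_loop(server):
    """Spawn a thread per connection and return to accept() immediately."""
    while True:
        conn, _addr = server.accept()
        threading.Thread(target=handle, args=(conn,), daemon=True).start()
```

Each client connects, sends its packet, and disconnects, so the handler threads are short-lived.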
Our first thought was to increase the TCP backlog on the listening socket. This did not alleviate the issue even when pushed to very high values. The default for a Java ServerSocket is 50, which is already higher than the 16-24 simultaneous connections we are failing to handle.
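Raising the backlog is a one-line change at socket creation; a sketch of what we tried (the value 500 is arbitrary, and note the OS may silently cap it, e.g. via somaxconn on Linux):

```python
import socket

def make_listener(port=0, backlog=500):
    """Listening socket with an enlarged accept queue."""
    # The backlog bounds how many completed TCP handshakes can queue up
    # before the application calls accept(); beyond that, new connection
    # attempts are refused or dropped.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("127.0.0.1", port))
    s.listen(backlog)
    return s
```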
We have run the test between machines on the same subnet, and disabled all local firewalls on the machines in case a firewall was rate-limiting connections to the server; no success.
We have also tried some tuning of the network settings on the Windows machine running the servers.
My feeling is that Windows is somehow limiting the number of inbound connections, but we aren't sure what to modify to allow more. The idea of an agent on the network limiting the connection rate also doesn't seem to hold up. We highly doubt that this number of simultaneous connections is overloading the physical gigabit network.
We're stumped. Has anybody else experienced a problem like this and found a solution?
Maximum number of sockets: for most socket interfaces, the maximum number of sockets allowed per connection between an application and the TCP/IP sockets interface is 65535.
For outgoing connections there is a limit imposed by ephemeral ports, i.e. the number of local ports that can be open at once. Following the IANA recommendation, ports 49152-65535 are used for this, so 16384 outgoing connections to a given destination can be supported. On both Windows and Linux this range can be adjusted.
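The arithmetic behind that figure (the range is inclusive at both ends):

```python
EPHEMERAL_LOW = 49152   # start of the IANA dynamic/ephemeral port range
EPHEMERAL_HIGH = 65535  # end of the range (inclusive)

def ephemeral_port_count(low=EPHEMERAL_LOW, high=EPHEMERAL_HIGH):
    """Number of distinct local ports available for outgoing connections."""
    return high - low + 1

# 65535 - 49152 + 1 = 16384
```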
A socket that has been established as a server can accept connection requests from multiple clients.
What is the maximum number of concurrent TCP connections that a server can handle, in theory? A single listening port can accept more than one connection simultaneously. There is a '64K' limit that is often cited, but that is per client, per server port, and needs clarifying: a TCP connection is identified by the 4-tuple (client IP, client port, server IP, server port), so a single server port can hold far more than 64K connections as long as they come from different client addresses.
I would check how many connections are in the TIME_WAIT state. I have seen this type of problem when many connections are opened and closed in quick succession, causing socket exhaustion through TIME_WAIT. To check, run:
netstat -an | find /c "TIME_WAIT"
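If you want to automate the check, a small helper can tally states from netstat output (the parsing below assumes the usual Windows four-column `netstat -an` layout; the sample data is made up):

```python
def count_states(netstat_output):
    """Count TCP connection states in `netstat -an`-style output."""
    counts = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0].startswith("TCP"):
            state = fields[3]  # e.g. TIME_WAIT, ESTABLISHED, LISTENING
            counts[state] = counts.get(state, 0) + 1
    return counts

# Hypothetical sample output for illustration:
sample = """\
TCP    10.0.0.5:9000    10.0.0.7:51234   TIME_WAIT
TCP    10.0.0.5:9000    10.0.0.8:51240   TIME_WAIT
TCP    10.0.0.5:9000    10.0.0.9:51260   ESTABLISHED
"""
```

A large and growing TIME_WAIT count while the test runs would point to socket exhaustion.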