If too many sockets are in TIME_WAIT, you will find it difficult to establish new outbound connections, because there is a shortage of local ports that can be used for the new connections.
TIME_WAIT does not mean the socket is waiting for a reply or for a new connection; it is the state a socket enters after its connection has been fully closed, where it lingers briefly so that delayed packets from the old connection are not mistaken for part of a new one.
TCP TIME_WAIT is a normal part of the TCP protocol: after sending the final ACK of the connection teardown, the closing side waits for twice the maximum segment lifetime (2*MSL) to be sure the remote TCP received the acknowledgement of its connection termination request. By default, MSL is 2 minutes.
There are two reasons for the TIME_WAIT state: to implement TCP's full-duplex connection termination reliably, and to allow old duplicate segments to expire in the network.
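As a concrete illustration, the following is a minimal Python sketch (assuming a Linux host, since it reads /proc/net/tcp, where state code 06 means TIME_WAIT) that opens a loopback connection, closes it from the client side first, and then counts TIME_WAIT sockets. Note that Linux actually uses a fixed TIME_WAIT interval of 60 seconds rather than literally tracking 2*MSL.

import socket, time

def count_time_wait():
    # /proc/net/tcp: the 4th column ("st") is the TCP state, 06 == TIME_WAIT
    with open("/proc/net/tcp") as f:
        next(f)                          # skip the header line
        return sum(1 for line in f if line.split()[3] == "06")

srv = socket.socket()
srv.bind(("127.0.0.1", 0))               # let the kernel pick a free port
srv.listen(1)

cli = socket.socket()
cli.connect(("127.0.0.1", srv.getsockname()[1]))
conn, _ = srv.accept()

before = count_time_wait()
cli.close()                              # active close: this side ends up in TIME_WAIT
conn.close()
time.sleep(0.5)                          # give the teardown a moment to complete
print("TIME_WAIT sockets before:", before, "after:", count_time_wait())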
Each socket in TIME_WAIT consumes some memory in the kernel, usually somewhat less than an ESTABLISHED socket yet still significant. A sufficiently large number could exhaust kernel memory, or at least degrade performance because that memory could be used for other purposes. TIME_WAIT sockets do not hold open file descriptors (assuming they have been closed properly), so you should not need to worry about a "too many open files" error.
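To illustrate that last point, here is a short Python sketch (again Linux-specific, since it inspects /proc/self/fd) showing that once a connected socket is closed, its file descriptor is gone from the process immediately, even though the kernel still tracks the connection in TIME_WAIT:

import os, socket, time

def open_fds():
    return len(os.listdir("/proc/self/fd"))

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)

cli = socket.socket()
cli.connect(("127.0.0.1", srv.getsockname()[1]))
conn, _ = srv.accept()

print("open fds while connected:", open_fds())
cli.close()                              # descriptor released right here...
conn.close()
time.sleep(0.5)
print("open fds after close:    ", open_fds())  # ...while the kernel socket lingers in TIME_WAIT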
The socket also ties up that particular src/dst IP address and port pair so it cannot be reused for the duration of the TIME_WAIT interval. (This is the intended purpose of the TIME_WAIT state.) Tying up the port is not usually an issue unless you need to reconnect with the same port pair. Most often one side will use an ephemeral port, with only one side anchored to a well-known port. However, a very large number of TIME_WAIT sockets can exhaust the ephemeral port space if you are repeatedly and frequently connecting between the same two IP addresses. Note this only affects this particular IP address pair, and will not affect establishment of connections with other hosts.
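One way to gauge how close you are to that limit is to compare the size of the ephemeral port range with the number of TIME_WAIT sockets per remote address. Below is a rough Python sketch (Linux-only; it parses /proc and assumes the little-endian IPv4 hex encoding used by /proc/net/tcp):

import socket, struct
from collections import Counter

# Ephemeral port range the kernel draws local ports from.
lo, hi = open("/proc/sys/net/ipv4/ip_local_port_range").read().split()
print("ephemeral port range size:", int(hi) - int(lo) + 1)

def peer_ip(addr_port):
    # /proc/net/tcp encodes IPv4 addresses as little-endian hex, e.g. 0100007F:1F90
    addr = addr_port.split(":")[0]
    return socket.inet_ntoa(struct.pack("<I", int(addr, 16)))

tw_per_peer = Counter()
with open("/proc/net/tcp") as f:
    next(f)                              # skip the header line
    for line in f:
        fields = line.split()
        if fields[3] == "06":            # 06 == TIME_WAIT
            tw_per_peer[peer_ip(fields[2])] += 1

for peer, count in tw_per_peer.most_common(10):
    print(peer, count)

If the count for one peer approaches the size of the port range, new connections to that peer will start to fail.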
Each connection is identified by a tuple (server IP, server port, client IP, client port). Crucially, the TIME_WAIT connections (whether they are on the server side or on the client side) each occupy one of these tuples.
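To make the tuple concrete, here is a tiny Python illustration: getsockname() and getpeername() on each end of a connection show the two (IP, port) pairs that together identify it.

import socket

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)

cli = socket.socket()
cli.connect(("127.0.0.1", srv.getsockname()[1]))
conn, _ = srv.accept()

# The same connection seen from both ends; swapping the pairs gives the same 4-tuple.
print("client view:", cli.getsockname(), "->", cli.getpeername())
print("server view:", conn.getpeername(), "->", conn.getsockname())

cli.close(); conn.close(); srv.close()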
With the TIME_WAITs on the client side, it's easy to see why you can't make any more connections - you have no more local ports. However, the same issue applies on the server side - once it has 64k connections in the TIME_WAIT state for a single client, it can't accept any more connections from that client, because it has no way to tell the difference between the old connection and the new connection - both are identified by the same tuple. The server should just send back RSTs to new connection attempts from that client in this case.
Findings so far:
Even if the server closed the socket using the close() system call, its file descriptor will not be released if the socket enters the TIME_WAIT state. The file descriptor will be released later, when the TIME_WAIT state is gone (i.e. after 2*MSL seconds). Therefore, too many TIME_WAITs could possibly lead to a 'too many open files' error in the server process.
I believe the OS TCP/IP stack is implemented with appropriate data structures (e.g. a hash table), so the total number of TIME_WAITs should not affect the performance of the OS TCP/IP stack. Only the process (server) which owns the sockets in the TIME_WAIT state will suffer.
If you have a lot of connections from many different client IPs to the server IPs you might run into limitations of the connection tracking table.
Check:
sysctl net.ipv4.netfilter.ip_conntrack_count
sysctl net.ipv4.netfilter.ip_conntrack_max
Over all src ip/port and dst ip/port tuples you can only have net.ipv4.netfilter.ip_conntrack_max entries in the tracking table. If this limit is hit, you will see the message "nf_conntrack: table full, dropping packet." in your logs, and the server will not accept new incoming connections until there is space in the tracking table again.
This limitation might hit you long before the ephemeral ports run out.
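A small Python sketch to check how full the table is (note that on newer kernels the counters are exposed as net.netfilter.nf_conntrack_count/max rather than the older net.ipv4.netfilter.ip_conntrack_* names, and they only exist while the conntrack module is loaded):

# Read the conntrack counters via /proc/sys, trying the newer names first.
def read_sysctl(*names):
    for name in names:
        try:
            with open("/proc/sys/" + name.replace(".", "/")) as f:
                return int(f.read())
        except FileNotFoundError:
            continue
    return None

count = read_sysctl("net.netfilter.nf_conntrack_count",
                    "net.ipv4.netfilter.ip_conntrack_count")
limit = read_sysctl("net.netfilter.nf_conntrack_max",
                    "net.ipv4.netfilter.ip_conntrack_max")

if count is None or limit is None:
    print("conntrack does not appear to be loaded on this host")
else:
    print(f"conntrack entries: {count}/{limit} ({100 * count / limit:.1f}% full)")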