We run haproxy 1.3.26 on a CentOS 5.9 machine with a 2.13 GHz Intel Xeon processor. It acts as an HTTP and TCP load balancer for numerous services, serving a peak throughput of ~2000 requests/second. It has been running fine for 2 years, but both the traffic and the number of services have been gradually increasing.
Of late we've observed that even after a reload the old haproxy process remains. On further investigation we found that the old process has numerous connections in the TIME_WAIT state. We also saw that netstat and lsof were taking a very long time. After reading http://agiletesting.blogspot.in/2013/07/the-mystery-of-stale-haproxy-processes.html we introduced option forceclose, but it interfered with various monitoring services, so we reverted it. On further digging we realised that /proc/net/sockstat shows close to 200K sockets in the tw (TIME_WAIT) state, which is surprising because in /etc/haproxy/haproxy.cfg maxconn is set to 31000 and ulimit-n to 64000. We had timeout server and timeout client set to 300s, which we changed to 30s, but it did not help much.
Now the doubts are :-
Note: The quotes in this answer are all from a mail by Willy Tarreau (the main author of HAProxy) to the HAProxy mailing list.
Connections in TIME_WAIT state are harmless and don't really consume any resources anymore. The kernel keeps them around for some time after the connection was closed, for the rare event that a delayed packet still arrives. A closed connection is typically held in that state for twice the maximum segment lifetime; on Linux the interval is fixed at 60 seconds.
"TIME_WAIT are harmless on the server side. You can easily reach millions without any issues."
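As a rough sanity check on the 200K figure from the question, the steady-state number of TIME_WAIT sockets is approximately the rate at which connections are closed multiplied by how long each one lingers. The sketch below assumes one new connection per request, that the proxy is the side that closes first, and a 60-second TIME_WAIT interval as on Linux. The function name is purely illustrative.

```python
def steady_state_time_wait(closes_per_second: float, hold_seconds: float) -> int:
    """Estimate sockets sitting in TIME_WAIT: close rate x time spent in state."""
    return int(closes_per_second * hold_seconds)


# Assumptions (not measured values): ~2000 requests/second at peak, one
# connection per request, and a 60-second TIME_WAIT interval.
client_side = steady_state_time_wait(2000, 60)

# A proxy also opens a backend connection for each request, so if it is the
# active closer on both sides the count can roughly double.
both_sides = 2 * client_side

print(client_side, both_sides)
```

Under these assumptions, somewhere between 120,000 and 240,000 sockets in TIME_WAIT is the expected order of magnitude. maxconn does not bound this number, because it only limits connections that are open concurrently.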
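If you still want to reduce that number to release connections earlier, the commonly suggested knob is tcp_fin_timeout, although on Linux it formally controls how long orphaned connections stay in FIN_WAIT_2 rather than the TIME_WAIT interval itself. To set it to 30 seconds, for example, execute this: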
echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout
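Before changing a kernel setting it is worth recording its current value. The following is a minimal sketch for reading a sysctl from a script, relying on the fact that a dotted sysctl name maps to a file path under /proc/sys. The helper names sysctl_path and read_sysctl are made up for this example, and the simple split on dots does not handle names whose components themselves contain dots.

```python
from pathlib import Path


def sysctl_path(name: str, root: str = "/proc/sys") -> Path:
    """Map a dotted sysctl name to its file under the given root."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name: str, root: str = "/proc/sys") -> str:
    """Return the current value of a sysctl as a stripped string."""
    return sysctl_path(name, root).read_text().strip()
```

On a Linux host, read_sysctl("net.ipv4.tcp_fin_timeout") returns the value that the echo command above overwrites. A value written through /proc does not survive a reboot; to make it permanent, a line such as net.ipv4.tcp_fin_timeout = 30 can be added to /etc/sysctl.conf.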
If you have many connections (either in TIME_WAIT or not), netstat, lsof, and ipcs perform very poorly and actually slow the whole system down. To quote Willy again:
There are two commands that you must absolutely never use in a monitoring system :
netstat -a
ipcs -a
Both of them will saturate the system and considerably slow it down when something starts to go wrong. For the sockets you should use what's in /proc/net/sockstat. You have all the numbers you want. If you need more details, use ss -a instead of netstat -a, it uses the netlink interface and is several orders of magnitude faster.
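For monitoring, the counters in /proc/net/sockstat can be read directly instead of invoking netstat. Here is a small sketch of a parser, assuming the usual layout of one section per line followed by alternating counter names and values. The function name and the sample numbers are illustrative only.

```python
def parse_sockstat(text: str) -> dict:
    """Parse /proc/net/sockstat content into {section: {counter: value}}."""
    stats = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        section, _, rest = line.partition(":")
        fields = rest.split()
        # Counters come as alternating name/value pairs, e.g. "inuse 35 tw 7".
        stats[section.strip()] = {
            name: int(value) for name, value in zip(fields[0::2], fields[1::2])
        }
    return stats


# Made-up sample in the usual layout of the file.
SOCKSTAT_SAMPLE = """\
sockets: used 294
TCP: inuse 35 orphan 0 tw 200123 alloc 45 mem 3
UDP: inuse 12 mem 2
"""

print(parse_sockstat(SOCKSTAT_SAMPLE)["TCP"]["tw"])
```

On a live Linux system the same function can be fed the real file, for example parse_sockstat(open("/proc/net/sockstat").read()), which is cheap enough to poll regularly.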
On Debian and Ubuntu systems, ss is available in the iproute or iproute2 package (depending on the version of your distribution).
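When per-state detail is needed, the text output of ss -tan (TCP sockets, all states, numeric addresses) can be summarised with a few lines. The sketch below assumes the state is the first column and that the first line is a header; the sample text is invented for illustration and the function name is not part of any tool.

```python
from collections import Counter


def count_tcp_states(ss_output: str) -> Counter:
    """Count connections per TCP state from `ss -tan` text output."""
    lines = ss_output.splitlines()
    # Skip the header line; the state name is the first column of each row.
    return Counter(line.split()[0] for line in lines[1:] if line.strip())


# Invented sample resembling `ss -tan` output.
SS_SAMPLE = """\
State      Recv-Q Send-Q Local Address:Port  Peer Address:Port
LISTEN     0      128    0.0.0.0:80          0.0.0.0:*
ESTAB      0      0      10.0.0.1:80         10.0.0.2:51234
TIME-WAIT  0      0      10.0.0.1:80         10.0.0.3:40112
TIME-WAIT  0      0      10.0.0.1:80         10.0.0.4:40113
"""

print(count_tcp_states(SS_SAMPLE)["TIME-WAIT"])
```

On a real host the input could come from subprocess.run(["ss", "-tan"], capture_output=True, text=True).stdout. If only totals are needed, /proc/net/sockstat remains the cheaper source.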