Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Dropping of connections with tcp_tw_recycle

Tags:

linux

tcp-ip

summary of the problem

we are having a setup wherein a lot(800 to 2400 per second( of incoming connections to a linux box and we have a NAT device between the client and server. so there are so many TIME_WAIT sockets left in the system. To overcome that we had set tcp_tw_recycle to 1, but that led to drop of in comming connections. after browsing through the net we did find the references for why the dropping of frames with tcp_tw_recycle and NAT device happens.

resolution tried

we then tried by setting tcp_tw_reuse to 1 it worked fine without any issues with the same setup and configuration.

But the documentation says that tcp_tw_recycle and tcp_tw_reuse should not be used when the Connections that go through TCP state aware nodes, such as firewalls, NAT devices or load balancers may see dropped frames. The more connections there are, the more likely you will see this issue.

Queries

1) can tcp_tw_reuse be used in this type of scenarios? 2) if not, which part of the linux code is preventing tcp_tw_reuse being used for such scenario? 3) generally what is the difference between tcp_tw_recycle and tcp_tw_reuse?

like image 520
user1153755 Avatar asked Jan 17 '12 11:01

user1153755


People also ask

What is Tcp_tw_recycle?

TCP_TW_RECYCLE uses the same server-side time-stamps, however it affects both inbound and outbound connections. This is useful when the server is the first party to initiate connection closure. This allows a new client inbound connection from the source IP to the server.

What is net ipv4 tcp_tw_reuse?

tcp_tw_reuse. Permits sockets in the time-wait state to be reused for new connections. In high traffic environments, sockets are created and destroyed at very high rates. This parameter, when set, allows no longer needed and about to be destroyed sockets to be used for new connections.

How do I fix TCP connection timeout in Linux?

Some time it is necessary to increase or decrease timeouts on TCP sockets. You can use /proc/sys/net/ipv4/tcp_keepalive_time to setup new value. The number of seconds a connection needs to be idle before TCP begins sending out keep-alive probes. Keep-alives are only sent when the SO_KEEPALIVE socket option is enabled.


1 Answers

By default, when both tcp_tw_reuse and tcp_tw_recycle are disabled, the kernel will make sure that sockets in TIME_WAIT state will remain in that state long enough -- long enough to be sure that packets belonging to future connections will not be mistaken for late packets of the old connection.

When you enable tcp_tw_reuse, sockets in TIME_WAIT state can be used before they expire, and the kernel will try to make sure that there is no collision regarding TCP sequence numbers. If you enable tcp_timestamps (a.k.a. PAWS, for Protection Against Wrapped Sequence Numbers), it will make sure that those collisions cannot happen. However, you need TCP timestamps to be enabled on both ends (at least, that's my understanding). See the definition of tcp_twsk_unique for the gory details.

When you enable tcp_tw_recycle, the kernel becomes much more aggressive, and will make assumptions on the timestamps used by remote hosts. It will track the last timestamp used by each remote host having a connection in TIME_WAIT state), and allow to re-use a socket if the timestamp has correctly increased. However, if the timestamp used by the host changes (i.e. warps back in time), the SYN packet will be silently dropped, and the connection won't establish (you will see an error similar to "connect timeout"). If you want to dive into kernel code, the definition of tcp_timewait_state_process might be a good starting point.

Now, timestamps should never go back in time; unless:

  • the host is rebooted (but then, by the time it comes back up, TIME_WAIT socket will probably have expired, so it will be a non issue);
  • the IP address is quickly reused by something else (TIME_WAIT connections will stay a bit, but other connections will probably be struck by TCP RST and that will free up some space);
  • network address translation (or a smarty-pants firewall) is involved in the middle of the connection.

In the latter case, you can have multiple hosts behind the same IP address, and therefore, different sequences of timestamps (or, said timestamps are randomized at each connection by the firewall). In that case, some hosts will be randomly unable to connect, because they are mapped to a port for which the TIME_WAIT bucket of the server has a newer timestamp. That's why the docs tell you that "NAT devices or load balancers may start drop frames because of the setting".

Some people recommend to leave tcp_tw_recycle alone, but enable tcp_tw_reuse and lower tcp_fin_timeout. I concur :-)

like image 172
jpetazzo Avatar answered Sep 21 '22 20:09

jpetazzo