I am experimenting with TCP keep alive on my Linux box, and have written the following small server:
#include <iostream>
#include <cstring>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h> // inet_pton
#include <netinet/tcp.h>
#include <netdb.h> // addrinfo stuff
#include <unistd.h> // close
using namespace std;
typedef int SOCKET;

int main(int argc, char *argv[])
{
    struct sockaddr_in sockaddr_IPv4;
    memset(&sockaddr_IPv4, 0, sizeof(struct sockaddr_in));
    sockaddr_IPv4.sin_family = AF_INET;
    sockaddr_IPv4.sin_port = htons(58080);
    if (inet_pton(AF_INET, "10.6.186.24", &sockaddr_IPv4.sin_addr) != 1)
        return -1;

    SOCKET serverSock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (serverSock == -1)
        return -1;
    if (bind(serverSock, (sockaddr*)&sockaddr_IPv4, sizeof(sockaddr_IPv4)) != 0 || listen(serverSock, SOMAXCONN) != 0)
    {
        cout << "Failed to set up listening socket!\n";
        return -1; // do not fall through to accept() on a broken socket
    }

    SOCKET clientSock = accept(serverSock, 0, 0);
    if (clientSock == -1)
        return -1;

    // Enable keep-alive on the client socket
    const int nVal = 1;
    if (setsockopt(clientSock, SOL_SOCKET, SO_KEEPALIVE, &nVal, sizeof(nVal)) < 0)
    {
        cout << "Failed to set keep-alive!\n";
        return -1;
    }

    // Get the keep-alive options that will be used on the client socket
    int nProbes = 0, nTime = 0, nInterval = 0;
    socklen_t nOptLen = sizeof(int);
    bool bError = false;
    if (getsockopt(clientSock, IPPROTO_TCP, TCP_KEEPIDLE, &nTime, &nOptLen) < 0) { bError = true; }
    nOptLen = sizeof(int);
    if (getsockopt(clientSock, IPPROTO_TCP, TCP_KEEPCNT, &nProbes, &nOptLen) < 0) { bError = true; }
    nOptLen = sizeof(int);
    if (getsockopt(clientSock, IPPROTO_TCP, TCP_KEEPINTVL, &nInterval, &nOptLen) < 0) { bError = true; }
    if (bError)
    {
        // Failed to retrieve values
        cout << "Failed to get keep-alive options!\n";
        return -1;
    }
    // Only print the values once we know they were retrieved successfully
    cout << "Keep alive settings are: time: " << nTime << ", interval: " << nInterval << ", number of probes: " << nProbes << "\n";

    // Read until the peer closes the connection (recv returns 0) or an
    // error occurs (recv returns -1, e.g. when keep-alive probes go unanswered)
    int nRead = 0;
    char buf[128];
    do
    {
        nRead = recv(clientSock, buf, sizeof(buf), 0);
    } while (nRead > 0);

    close(clientSock);
    close(serverSock);
    return 0;
}
I then adjusted the system-wide TCP keep alive settings to be as follows:
# cat /proc/sys/net/ipv4/tcp_keepalive_time
20
# cat /proc/sys/net/ipv4/tcp_keepalive_intvl
30
I then connected to my server from Windows, and ran a Wireshark trace to see the keep-alive packets. The image below shows the result.
This confused me, since I now understand the keep-alive interval to come into play only if no ACK is received in response to the original keep-alive packet (see my other question here). So I would expect every keep-alive packet, not just the first one, to be sent at 20-second intervals (not 30, which is what we see).
I then adjusted the system-wide settings as follows:
# cat /proc/sys/net/ipv4/tcp_keepalive_time
30
# cat /proc/sys/net/ipv4/tcp_keepalive_intvl
20
This time when I connect, I see the following in my Wireshark trace:
Now we see that the first keep-alive packet is sent after 30 seconds, but each one thereafter is also sent at 30 seconds, not the 20 as would be suggested by the previous run!
Can someone please explain this inconsistent behaviour?
Roughly speaking, this is how it is supposed to work: a keepalive message is sent once the connection has been idle for tcp_keepalive_time seconds. If an ACK is not received, it then probes every tcp_keepalive_intvl seconds. If no ACK has been received after tcp_keepalive_probes probes, the connection is aborted. Thus, a connection will be aborted after at most
tcp_keepalive_time + tcp_keepalive_probes * tcp_keepalive_intvl
seconds without a response. See this kernel documentation.
We can easily watch this work using netcat keepalive, a version of netcat that allows us to set TCP keepalive parameters (the sysctl keepalive parameters are the defaults, but they can be overridden on a per-socket basis in the tcp_sock struct).
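From user space, the per-socket override is done with setsockopt. The following is a minimal sketch, assuming Linux and the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT socket options; the helper name setKeepAlive is hypothetical and is not how netcat keepalive is necessarily implemented.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Hypothetical helper: enable keep-alive on `sock` and override the three
// system-wide defaults for this socket only. Returns true if every
// setsockopt call succeeded.
bool setKeepAlive(int sock, int idleSecs, int intervalSecs, int probeCount)
{
    const int on = 1;
    return setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == 0
        && setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE, &idleSecs, sizeof(idleSecs)) == 0
        && setsockopt(sock, IPPROTO_TCP, TCP_KEEPINTVL, &intervalSecs, sizeof(intervalSecs)) == 0
        && setsockopt(sock, IPPROTO_TCP, TCP_KEEPCNT, &probeCount, sizeof(probeCount)) == 0;
}
```

Calling setKeepAlive(sock, 5, 1, 4) on a TCP socket would correspond to a keepalive time of 5 seconds, an interval of 1 second and 4 probes.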
First, start up a server listening on port 8888 with keepalive_time set to 5 seconds, keepalive_intvl set to 1 second, and keepalive_probes set to 4:
$ ./nckl-linux -K -O 5 -I 1 -P 4 -l 8888 >/dev/null &
Next, let's use iptables to introduce loss for ACK packets sent to the server:
$ sudo iptables -A OUTPUT -p tcp --dport 8888 \
> --tcp-flags SYN,ACK,RST,FIN ACK \
> -m statistic --mode random --probability 0.5 \
> -j DROP
This will cause packets that are sent to TCP port 8888 with just the ACK flag set to be dropped with probability 0.5.
Now let's connect and watch with the vanilla netcat (which will use the sysctl keepalive values):
$ nc localhost 8888
Here is the capture:
As you can see, it waits 5 seconds after receiving an ACK before sending another keepalive message. If it doesn't receive an ACK within 1 second, it sends another probe, and if it doesn't receive an ACK after 4 probes, it aborts the connection. This is exactly how keepalive is supposed to work.
So let's try to reproduce what you were seeing. Let's delete the iptables rule (no loss), start a new server with tcp_keepalive_time set to 1 second and tcp_keepalive_intvl set to 5 seconds, and then connect with a client. Here is the result:
Interestingly, we see the same behavior you did: after the first ACK, it waits 1 second to send a keepalive message, and thereafter every 5 seconds.
Let's add the iptables rule back in to introduce loss, to see how long it actually waits to send another probe if it doesn't get an ACK (using -K -O 1 -I 5 -P 4 on the server):
Again, it waits 1 second from the first ACK to send a keepalive message, but thereafter it waits 5 seconds whether it sees an ACK or not, as if keepalive_time and keepalive_intvl are both set to 5.
In order to understand this behavior, we need to take a look at the Linux kernel TCP implementation. Let's first look at tcp_finish_connect:
if (sock_flag(sk, SOCK_KEEPOPEN))
    inet_csk_reset_keepalive_timer(sk, keepalive_time_when(tp));
When the TCP connection is established, the keepalive timer is effectively set to tcp_keepalive_time, which is 1 second in our case.
Next, let's take a look at how the timer is processed in tcp_keepalive_timer:
elapsed = keepalive_time_elapsed(tp);
if (elapsed >= keepalive_time_when(tp)) {
    /* If the TCP_USER_TIMEOUT option is enabled, use that
     * to determine when to timeout instead.
     */
    if ((icsk->icsk_user_timeout != 0 &&
         elapsed >= icsk->icsk_user_timeout &&
         icsk->icsk_probes_out > 0) ||
        (icsk->icsk_user_timeout == 0 &&
         icsk->icsk_probes_out >= keepalive_probes(tp))) {
        tcp_send_active_reset(sk, GFP_ATOMIC);
        tcp_write_err(sk);
        goto out;
    }
    if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
        icsk->icsk_probes_out++;
        elapsed = keepalive_intvl_when(tp);
    } else {
        /* If keepalive was lost due to local congestion,
         * try harder.
         */
        elapsed = TCP_RESOURCE_PROBE_INTERVAL;
    }
} else {
    /* It is tp->rcv_tstamp + keepalive_time_when(tp) */
    elapsed = keepalive_time_when(tp) - elapsed;
}
sk_mem_reclaim(sk);
resched:
inet_csk_reset_keepalive_timer (sk, elapsed);
goto out;
When keepalive_time_when is greater than keepalive_intvl_when, this code works as expected. However, when it is not, you see the behavior you observed.
When the initial timer (set when the TCP connection is established) expires after 1 second, we will extend the timer until elapsed is at least keepalive_time_when. At that point we will send a probe and set the timer to keepalive_intvl_when, which is 5 seconds. When this timer expires, if nothing has been received for the last 1 second (keepalive_time_when), we will send another probe, set the timer to keepalive_intvl_when again, wake up in another 5 seconds, and so on.
However, if we have received something within keepalive_time_when when the timer expires, it will use keepalive_time_when to reschedule the timer for 1 second after the last time we received anything.
So, to answer your question: the Linux implementation of TCP keepalive assumes that keepalive_intvl is less than keepalive_time, but nevertheless works "sensibly."
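To make the timer arithmetic concrete, here is a rough user-space model of the rescheduling logic quoted from tcp_keepalive_timer. It is only a sketch under simplifying assumptions that are mine, not the kernel's: time is counted in whole seconds, the connection is established at t = 0, every probe is ACKed immediately, there is no other traffic, and the probe counter and TCP_USER_TIMEOUT are ignored. The function name probeTimes is hypothetical.

```cpp
#include <vector>

// Simplified model: returns the times (in seconds since the connection was
// established) at which keepalive probes are sent, up to `horizon`.
// Assumes keepTime > 0 and keepIntvl > 0, and that every probe is ACKed
// at the moment it is sent.
std::vector<int> probeTimes(int keepTime, int keepIntvl, int horizon)
{
    std::vector<int> sent;
    int lastRecv = 0;    // last time anything was received from the peer
    int now = keepTime;  // first expiry, as armed in tcp_finish_connect
    while (now <= horizon) {
        int elapsed = now - lastRecv;
        if (elapsed >= keepTime) {
            sent.push_back(now);       // send a probe
            lastRecv = now;            // its ACK arrives right away in this model
            now += keepIntvl;          // timer re-armed with the interval
        } else {
            now += keepTime - elapsed; // re-arm for the rest of the idle time
        }
    }
    return sent;
}
```

Under these assumptions, probeTimes(20, 30, 100) yields 20, 50, 80 and probeTimes(30, 20, 100) yields 30, 60, 90, which lines up with the two traces described in the question; probeTimes(1, 5, 20) yields 1, 6, 11, 16, as in the reproduction above.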