
TCP keep-alive parameters not being honoured

I am experimenting with TCP keep alive on my Linux box, and have written the following small server:

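#include <iostream>
#include <cstring>

#include <sys/socket.h>     // socket, bind, listen, accept, recv
#include <netinet/in.h>
#include <arpa/inet.h>      // inet_pton
#include <netinet/tcp.h>    // TCP_KEEPIDLE, TCP_KEEPCNT, TCP_KEEPINTVL
#include <netdb.h>          // addrinfo stuff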

using namespace std;

typedef int SOCKET;

int main(int argc, char *argv []) 
{
    struct sockaddr_in sockaddr_IPv4;
    memset(&sockaddr_IPv4, 0, sizeof(struct sockaddr_in));
    sockaddr_IPv4.sin_family = AF_INET;
    sockaddr_IPv4.sin_port = htons(58080);

    if (inet_pton(AF_INET, "10.6.186.24", &sockaddr_IPv4.sin_addr) != 1)
        return -1;

    SOCKET serverSock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

    if (bind(serverSock, (sockaddr*)&sockaddr_IPv4, sizeof(sockaddr_IPv4)) != 0 || listen(serverSock, SOMAXCONN) != 0) 
    { 
        cout << "Failed to setup listening socket!\n";
        return -1;
    }

    SOCKET clientSock = accept(serverSock, 0, 0);
    if (clientSock == -1) 
        return -1;

    // Enable keep-alive on the client socket
    const int nVal = 1;
    if (setsockopt(clientSock, SOL_SOCKET, SO_KEEPALIVE, &nVal, sizeof(nVal)) < 0)
    {
        cout << "Failed to set keep-alive!\n";
        return -1;
    }

    // Get the keep-alive options that will be used on the client socket

    int nProbes, nTime, nInterval;
    socklen_t nOptLen = sizeof(int);
    bool bError = false;

    if (getsockopt(clientSock, IPPROTO_TCP, TCP_KEEPIDLE, &nTime, &nOptLen) < 0) { bError = true; }
    nOptLen = sizeof(int);

    if (getsockopt(clientSock, IPPROTO_TCP, TCP_KEEPCNT, &nProbes, &nOptLen) < 0) {bError = true; }
    nOptLen = sizeof(int);

    if (getsockopt(clientSock, IPPROTO_TCP, TCP_KEEPINTVL, &nInterval, &nOptLen) < 0) { bError = true; }

    if (bError) 
    {
        // Failed to retrieve values; do not print the uninitialised ints
        cout << "Failed to get keep-alive options!\n";
        return -1;
    }

    cout << "Keep alive settings are: time: " << nTime << ", interval: " << nInterval << ", number of probes: " << nProbes << "\n";

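    // Read until the peer closes (0) or the connection fails (-1),
    // e.g. with ETIMEDOUT once the keep-alive probes go unanswered
    ssize_t nRead = 0;
    char buf[128];
    do 
    {
        nRead = recv(clientSock, buf, sizeof(buf), 0);
    } while (nRead > 0);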


    return 0;
}

I then adjusted the system-wide TCP keep alive settings to be as follows:

# cat /proc/sys/net/ipv4/tcp_keepalive_time
20
# cat /proc/sys/net/ipv4/tcp_keepalive_intvl
30

I then connected to my server from Windows, and ran a Wireshark trace to see the keep-alive packets. The image below shows the result.

[Image: Packets 1]

This confused me, since I now understand that the keep-alive interval only comes into play if no ACK is received in response to the original keep-alive packet (see my other question here). So I would expect every keep-alive packet, not just the first one, to be sent 20 seconds after the previous one; instead, the subsequent packets are sent at 30-second intervals.

I then adjusted the system wide settings as follows:

# cat /proc/sys/net/ipv4/tcp_keepalive_time
30
# cat /proc/sys/net/ipv4/tcp_keepalive_intvl
20

This time when I connect, I see the following in my Wireshark trace:

[Image: Packets 2]

Now we see that the first keep-alive packet is sent after 30 seconds, but each one thereafter is also sent at 30 seconds, not the 20 as would be suggested by the previous run!

Can someone please explain this inconsistent behaviour?

Wad asked Mar 15 '17 17:03




1 Answer

Roughly speaking, how it is supposed to work is that a keepalive probe is sent whenever the connection has been idle for tcp_keepalive_time seconds. If an ACK is not received, it will then probe again every tcp_keepalive_intvl seconds. If an ACK has still not been received after tcp_keepalive_probes probes, the connection will be aborted. Thus, a connection will be aborted after at most

    tcp_keepalive_time + tcp_keepalive_probes * tcp_keepalive_intvl

seconds without a response. See this kernel documentation.
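As a quick sanity check of that bound, here is a minimal C++ sketch. The function name `worstCaseAbortSeconds` is made up for illustration (it is not a kernel or libc API); the first set of values is the one used in the netcat experiment further down, the second is the usual Linux defaults (7200 s, 75 s, 9 probes).

```cpp
#include <cstdint>

// Upper bound, in seconds, on how long an idle connection to an
// unresponsive peer survives, per the formula above.
// (Illustrative helper, not part of any real API.)
constexpr std::int64_t worstCaseAbortSeconds(std::int64_t keepaliveTime,
                                             std::int64_t keepaliveIntvl,
                                             std::int64_t keepaliveProbes)
{
    return keepaliveTime + keepaliveProbes * keepaliveIntvl;
}

// 5 s idle time, 1 s interval, 4 probes -> at most 9 s
static_assert(worstCaseAbortSeconds(5, 1, 4) == 9, "5 + 4 * 1");
// Usual Linux defaults -> at most 7875 s (a little over two hours)
static_assert(worstCaseAbortSeconds(7200, 75, 9) == 7875, "7200 + 9 * 75");
```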

We can easily watch this work using netcat keepalive, a version of netcat that allows us to set TCP keepalive parameters (the sysctl keepalive parameters are the defaults, but they can be overridden on a per-socket basis in the tcp_sock struct).
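For reference, a per-socket override from your own code might look roughly like the sketch below. It uses the Linux-specific socket options TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT; the helper name `setKeepAlive` and its parameter names are my own invention for illustration.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Enable keepalive on `fd` and override the three sysctl defaults for this
// socket only. Returns true only if every setsockopt call succeeded.
// (Illustrative helper; Linux-specific option names.)
bool setKeepAlive(int fd, int idleSecs, int intervalSecs, int probeCount)
{
    const int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == 0
        && setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idleSecs, sizeof(idleSecs)) == 0
        && setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intervalSecs, sizeof(intervalSecs)) == 0
        && setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probeCount, sizeof(probeCount)) == 0;
}
```

In the server from the question, calling something like `setKeepAlive(clientSock, 5, 1, 4)` right after `accept` should give that one connection a 5 second idle time, a 1 second probe interval and 4 probes, regardless of the sysctl values.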

First start up a server listening on port 8888 with keepalive_time set to 5 seconds, keepalive_intvl set to 1 second, and keepalive_probes set to 4.

    $ ./nckl-linux -K -O 5 -I 1 -P 4 -l 8888 >/dev/null &

Next, let's use iptables to introduce loss for ACK packets sent to the server:

    $ sudo iptables -A OUTPUT -p tcp --dport 8888 \
    >   --tcp-flags SYN,ACK,RST,FIN ACK \
    >   -m statistic --mode random --probability 0.5 \
    >   -j DROP

This will cause packets that are sent to TCP port 8888 with just the ACK flag set to be dropped with probability 0.5.

Now let's connect and watch with the vanilla netcat (which will use the sysctl keepalive values):

    $ nc localhost 8888

Here is the capture:

[Image: TCP keepalive capture]

As you can see, it waits 5 seconds after receiving an ACK before sending another keepalive message. If it doesn't receive an ACK within 1 second, it sends another probe, and if it doesn't receive an ACK after 4 probes, it aborts the connection. This is exactly how keepalive is supposed to work.

So let's try to reproduce what you were seeing. Let's delete the iptables rule (no loss), start a new server with tcp_keepalive_time set to 1 second, and tcp_keepalive_intvl set to 5 seconds, and then connect with a client. Here is the result:

[Image: Capture with keepalive_time < keepalive_intvl, no loss]

Interestingly, we see the same behavior you did: after the first ACK, it waits 1 second to send a keepalive message, and thereafter every 5 seconds.

Let's add the iptables rule back in to introduce loss to see what time it actually waits to send another probe if it doesn't get an ACK (using -K -O 1 -I 5 -P 4 on the server):

[Image: Capture with keepalive_time < keepalive_intvl, with loss]

Again, it waits 1 second from the first ACK to send a keepalive message, but thereafter it waits 5 seconds whether it sees an ACK or not, as if keepalive_time and keepalive_intvl are both set to 5.

In order to understand this behavior, we will need to take a look at the Linux kernel TCP implementation. Let's first look at tcp_finish_connect:

  if (sock_flag(sk, SOCK_KEEPOPEN))
          inet_csk_reset_keepalive_timer(sk, keepalive_time_when(tp));

When the TCP connection is established, the keepalive timer is effectively set to tcp_keepalive_time, which is 1 second in our case.

Next, let's take a look at how the timer is processed in tcp_keepalive_timer:

  elapsed = keepalive_time_elapsed(tp);

  if (elapsed >= keepalive_time_when(tp)) {
          /* If the TCP_USER_TIMEOUT option is enabled, use that
           * to determine when to timeout instead.
           */
          if ((icsk->icsk_user_timeout != 0 &&
              elapsed >= icsk->icsk_user_timeout &&
              icsk->icsk_probes_out > 0) ||
              (icsk->icsk_user_timeout == 0 &&
              icsk->icsk_probes_out >= keepalive_probes(tp))) {
                  tcp_send_active_reset(sk, GFP_ATOMIC);
                  tcp_write_err(sk);
                  goto out;
          }
          if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
                  icsk->icsk_probes_out++;
                  elapsed = keepalive_intvl_when(tp);
          } else {
                  /* If keepalive was lost due to local congestion,
                   * try harder.
                   */
                  elapsed = TCP_RESOURCE_PROBE_INTERVAL;
          }
  } else {
          /* It is tp->rcv_tstamp + keepalive_time_when(tp) */
          elapsed = keepalive_time_when(tp) - elapsed;
  }

  sk_mem_reclaim(sk);

resched:
  inet_csk_reset_keepalive_timer (sk, elapsed);
  goto out;

When keepalive_time_when is greater than keepalive_intvl_when, this code works as expected. However, when it is not, you see the behavior you observed.

When the initial timer (set when the TCP connection is established) expires after 1 second, we will extend the timer until elapsed is greater than keepalive_time_when. At that point we will send a probe, and will set the timer to keepalive_intvl_when, which is 5 seconds. When this timer expires, if nothing has been received for the last 1 second (keepalive_time_when), we will send a probe, and then set the timer again to keepalive_intvl_when, and wake up in another 5 seconds, and so on.

However, if we have received something within keepalive_time_when when the timer expires, it will use keepalive_time_when to reschedule the timer for 1 second since the last time we received anything.
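To make the rescheduling easier to follow, here is a small user-space sketch of that logic. It is a simplified model, not kernel code: time is in whole seconds, the connection is established at t = 0 and is otherwise idle, and the peer's ACK is assumed to arrive in the same second the probe is sent. The function name `simulateProbeTimes` and the `horizon` parameter are made up for illustration; both keepalive values must be positive.

```cpp
#include <vector>

// Simplified model of the timer rescheduling in tcp_keepalive_timer for a
// peer that ACKs every probe immediately. Returns the times (in seconds
// since the connection was established) at which probes are sent, up to
// and including `horizon`. Assumes keepaliveTime > 0 and keepaliveIntvl > 0.
std::vector<int> simulateProbeTimes(int keepaliveTime, int keepaliveIntvl, int horizon)
{
    std::vector<int> probes;
    int lastRx = 0;             // time the last segment was received
    int timer = keepaliveTime;  // armed when the connection is established

    while (timer <= horizon) {
        const int now = timer;
        const int elapsed = now - lastRx;
        if (elapsed >= keepaliveTime) {
            probes.push_back(now);         // idle long enough: send a probe
            lastRx = now;                  // modelled peer ACKs at once
            timer = now + keepaliveIntvl;  // next wake-up uses the interval
        } else {
            // Heard from the peer recently: sleep until keepaliveTime has
            // passed since the last received segment.
            timer = now + (keepaliveTime - elapsed);
        }
    }
    return probes;
}
```

Under these assumptions, `simulateProbeTimes(20, 30, 100)` yields probes at 20, 50 and 80 seconds, and `simulateProbeTimes(30, 20, 100)` yields 30, 60 and 90, which lines up with both traces in the question. With keepalive_time larger than keepalive_intvl and an ACKing peer, the interval wake-up always lands in the else branch, so probes end up spaced keepalive_time apart.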

So, to answer your question, the Linux implementation of TCP keepalive assumes that keepalive_intvl is less than keepalive_time, but nevertheless works "sensibly" when it is not.

Jim D. answered Sep 20 '22 14:09