
Unusually slow TCP-connection in Linux

I wrote a user-mode client-server C application based on Berkeley sockets; the two sides talk over a private network.
The situation is definitely strange. Occasionally, under circumstances I cannot pin down, the connection becomes very slow. Normally the TCP exchange carries about 10-25 KB of payload per segment, but sometimes it drops to about 200-500 bytes per segment.

After some troubleshooting, I found that the problem is not reproducible with other network services, so it looks like my service is to blame. But I can't figure out what's wrong. It worked well on the 3.10 Linux kernel but shows this strange behavior on 4.4. Could some internal kernel change have caused it?

I tried changing these Linux sysctl settings:

net.ipv4.tcp_congestion_control
net.ipv4.tcp_sack
net.ipv4.route.flush

but that did not help.

The problem seems to appear on the listening socket's side. In tcpdump, the TCP window size is fine during the handshake, but after the first incoming packet the window size advertised by the listener shrinks.

UPD
Here is my server-side code snippet:

 serv_fd = socket(AF_INET, SOCK_STREAM, 0); 
 if (serv_fd == -1) {
      perror("socket");
      return;
 }   

 server.sin_family = AF_INET;
 server.sin_port = htons(LISTEN_PORT);
 server.sin_addr.s_addr = htonl(INADDR_ANY);

 #ifdef SET_BUF
 if (setsockopt(serv_fd, SOL_SOCKET, SO_RCVBUF, &buflen, sizeof(int)) == -1) {
      perror ("setsockopt");
      return;
 }   
 if (setsockopt(serv_fd, SOL_SOCKET, SO_SNDBUF, &buflen, sizeof(int)) == -1) {
      perror ("setsockopt");
      return;
 }   
 #endif // SET_BUF

 if (bind(serv_fd, (struct sockaddr *) &server, sizeof(server)) == -1) {
      perror("bind");
      return;
 }   

 if (listen(serv_fd, 3)) {
      perror("listen");
      return;
 }   

 printf("Server is listening on %u\n", LISTEN_PORT);

Could someone shed some light on my problem? I would be very grateful!
Could it be related to some recent Linux kernel changes? Do I need to tune some kernel settings or check some user-mode settings (e.g. socket options)?

P.S. The problem is intermittent.

UPD:

tcpdump's output:

IP 10.0.0.34.31334 > 10.0.0.99.12345: Flags [S], seq 426261790, win 43690, options [mss 65495,sackOK,TS val 799180610 ecr 0,nop,wscale 7], length 0
IP 10.0.0.99.12345 > 10.0.0.34.31334: Flags [S.], seq 803872704, ack 426261791, win 65483, options [mss 65495,sackOK,TS val 799180567 ecr 799180610,nop,wscale 0], length 0
IP 10.0.0.34.31334 > 10.0.0.99.12345: Flags [.], ack 1, win 342, options [nop,nop,TS val 799180610 ecr 799180567], length 0
IP 10.0.0.34.31334 > 10.0.0.99.12345: Flags [P.], seq 1:1301, ack 1, win 342, options [nop,nop,TS val 799180610 ecr 799180567], length 1300
IP 10.0.0.34.31334 > 10.0.0.99.12345: Flags [P.], seq 1301:1804, ack 1, win 342, options [nop,nop,TS val 799181412 ecr 799180610], length 503
IP 10.0.0.99.12345 > 10.0.0.34.31334: Flags [.], ack 1804, win 512, options [nop,nop,TS val 799181412 ecr 799181412], length 0

10.0.0.34.31334 is the client, 10.0.0.99.12345 is the server. Note the unexpected win 512 in the last line.

UPD2: I saw several messages about SYN-cookies in dmesg, such as:

possible SYN flooding on port 12345. Sending cookies.

But they do not line up closely in time with the slow transmissions.

z0lupka asked Jan 16 '20 17:01




1 Answer

I'm not sure this is exactly your case, but it looks similar. It seems to be a known problem.

Reasons

Several circumstances together can lead to this Linux kernel behavior:

  • The way the kernel handles connections in SYN-cookies context when the connection has a zero Window Scale (or a WS modified in some other way).
  • The zero Window Scale itself, which you provoked by calling setsockopt() with SO_RCVBUF (see tcp_select_initial_window()).
  • An extremely small backlog.

Explanation

About "slow" transmission:
The Window Scale option is negotiated during the [SYN - SYN+ACK] stage by both hosts. Roughly speaking, Host A says "shift the window sizes I advertise left by N bits during the rest of the exchange" (SYN), then Host B says "shift the window sizes I advertise left by M bits" (SYN+ACK); N and M may be the same. In a normal situation these shift counts are stored and used throughout the data exchange.
But the TCP SYN-cookies technique means the server forgets the [SYN - SYN+ACK] stage of the connection (some negotiated options, including WS, are lost after SYN+ACK). In that case the Linux kernel recalculates the WS value when the ACK arrives (once the ACK has arrived, a regular connection has to be created). That second calculation can differ from the first because setsockopt() does not affect it (for some objective reasons). So here is what you face: your server sends a zero Window Scale option in the SYN+ACK, forgets it, then re-creates the connection when the ACK arrives as if it had some default Window Scale (e.g. 7), and advertises a small window on the assumption that the client will multiply it by 128. The client, however, has not forgotten that WS is 0 and takes the small window at face value. Hence it sends data in small portions, and that is your "slow" connection.
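
The mismatch can be put in numbers. Below is a minimal sketch in C, assuming the values from the trace in the question (raw window field 512, server scale 0 in the SYN+ACK) and assuming the server recomputed a scale of 7 internally (the example default mentioned above, not a measured value). `effective_window` and the two wrappers are hypothetical helpers for illustration, not kernel functions.

```c
#include <stdint.h>

/* Hypothetical helper: the receive window a host assumes, given the 16-bit
 * window field from the TCP header and the window scale shift it believes
 * applies to the sender of that field. */
static uint32_t effective_window(uint16_t raw_window, unsigned wscale)
{
    if (wscale > 14)
        wscale = 14;              /* RFC 7323 caps the shift count at 14 */
    return (uint32_t)raw_window << wscale;
}

/* What the server means when it writes 512 while assuming scale 7
 * (the scale of 7 is an assumption taken from the explanation above). */
static uint32_t server_intended_window(void)
{
    return effective_window(512, 7);
}

/* What the client understands, having seen "wscale 0" in the SYN+ACK. */
static uint32_t client_perceived_window(void)
{
    return effective_window(512, 0);
}
```

Under these assumptions the server intends 512 << 7 = 65536 bytes, while the client, still holding scale 0, sees a 512-byte window. That is in line with the 200-500 byte segments described in the question.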

About SYN-flood:
With such a small backlog, as few as 3 SYN retransmissions can provoke SYN-cookies (i.e. fill your backlog queue). BTW, do you see retransmissions in tcpdump?
From ip-sysctl.txt:

Note, that syncookies is fallback facility.
It MUST NOT be used to help highly loaded servers to stand
against legal connection rate. If you see SYN flood warnings
in your logs, but investigation shows that they occur
because of overload with legal connections, you should tune
another parameters until this warning disappear.
See: tcp_max_syn_backlog, tcp_synack_retries, tcp_abort_on_overflow.

syncookies seriously violate TCP protocol, do not allow
to use TCP extensions, can result in serious degradation
of some services (f.e. SMTP relaying), visible not by you,
but your clients and relays, contacting you. While you see
SYN flood warnings in logs not being really flooded, your server
is seriously misconfigured.

So if there are no SYN-flood attacks in your LAN, your server is seriously misconfigured. SYN-cookies should do their job only when a SYN-flood attack is actually present.
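
One way to check whether cookies are actually being sent is to watch the kernel's SYN-cookie counters. The sketch below is only an illustration: it assumes the usual layout of /proc/net/netstat (a line of counter names followed by a line of values, both prefixed with `TcpExt:`) and the counter name `SyncookiesSent`. `netstat_field` and `read_tcpext_counter` are hypothetical helper names.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: given a line of column names and the matching line of
 * values, return the value in the column called `name`, or -1 if absent. */
static long long netstat_field(const char *header, const char *values,
                               const char *name)
{
    size_t name_len = strlen(name);

    while (*header && *values) {
        /* Skip separators, then measure the next token on each line. */
        header += strspn(header, " \n");
        values += strspn(values, " \n");
        size_t hlen = strcspn(header, " \n");
        size_t vlen = strcspn(values, " \n");

        if (hlen == 0 || vlen == 0)
            break;
        if (hlen == name_len && strncmp(header, name, hlen) == 0)
            return strtoll(values, NULL, 10);
        header += hlen;
        values += vlen;
    }
    return -1;
}

/* Read one TcpExt counter from /proc/net/netstat; returns -1 on failure.
 * Assumes each line fits into the buffers below. */
static long long read_tcpext_counter(const char *name)
{
    static char header[8192], values[8192];
    long long result = -1;
    FILE *f = fopen("/proc/net/netstat", "r");

    if (f == NULL)
        return -1;
    /* The file is read as pairs of lines: names first, then values. */
    while (fgets(header, sizeof header, f) && fgets(values, sizeof values, f)) {
        if (strncmp(header, "TcpExt:", 7) == 0) {
            result = netstat_field(header, values, name);
            break;
        }
    }
    fclose(f);
    return result;
}
```

Calling `read_tcpext_counter("SyncookiesSent")` before and after a slow transfer shows whether the kernel sent cookies in between, which may be easier to correlate with individual slow transfers than the dmesg message.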


Solution

To sum up, there are a few things you can do to eliminate the problem:

  1. If there is a real SYN-flood in your network, SYN-cookies partially solve that security issue. With a real attack, there’s no time to think about slow connections. This is an emergency.
  2. If not, i.e. a few SYN retransmissions provoked SYN-cookies:
    • increase the backlog, thoughtfully, to eliminate such conditions;
    • don't call setsockopt() with SO_RCVBUF on the listening socket. It doesn't make much sense. Without it, you reduce the probability of the kernel calculating different WS values in the scenario above. Btw, you can set SO_RCVBUF on the accepted socket if needed.
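
The two suggestions in item 2 could look roughly like this in the server code. This is a sketch rather than a drop-in fix: `make_listener` and `accept_with_rcvbuf` are hypothetical helper names, and the backlog value is left to the caller because the right number depends on the expected connection rate.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a listening TCP socket without touching
 * SO_RCVBUF, so the kernel chooses the window scale on its own.
 * Returns the descriptor, or -1 on error. */
static int make_listener(uint16_t port, int backlog)
{
    struct sockaddr_in addr = {0};
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;

    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(fd, backlog) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: accept one connection and, only if buflen > 0,
 * set SO_RCVBUF on the accepted socket instead of on the listener.
 * Returns the connection descriptor, or -1 on error. */
static int accept_with_rcvbuf(int listen_fd, int buflen)
{
    int conn_fd = accept(listen_fd, NULL, NULL);

    if (conn_fd == -1)
        return -1;
    if (buflen > 0 &&
        setsockopt(conn_fd, SOL_SOCKET, SO_RCVBUF, &buflen, sizeof buflen) == -1) {
        close(conn_fd);
        return -1;
    }
    return conn_fd;
}
```

A server might call `make_listener(LISTEN_PORT, 128)` once and then `accept_with_rcvbuf(fd, buflen)` in its accept loop; 128 is an arbitrary illustrative value, and the kernel caps the effective backlog at net.core.somaxconn. Keep in mind that the window scale is fixed during the handshake, so a buffer size applied after accept() cannot change the scale that was already negotiated.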

Repro

I reproduced your problem with a simple client and server plus hping3 under roughly the same conditions. You can fill the server's backlog queue with:

hping3 -c 3 -S -p 12345 --fast 10.0.0.99

then initiate a connection from the client. The connection will be opened in the so-called "SYN-cookies context", at least on a 4.4 kernel. You can also check this on a 3.10 kernel by increasing -c from 3 to X until the problem reproduces.

red0ct answered Sep 30 '22 19:09
