Unusually slow TCP-connection in Linux

I wrote a user-mode client-server C application based on Berkeley sockets; the two sides talk over a private network.
The situation is strange: occasionally, under circumstances I cannot pin down, the connection becomes very slow. Normally each TCP segment carries about 10-25 KB of payload, but sometimes this drops to about 200-500 bytes per segment.

After some troubleshooting, I found that the problem does not reproduce with other network services, so it looks like my service is to blame, but I can't figure out what is wrong. It worked well on the 3.10 Linux kernel but shows this strange behavior on 4.4. Could some internal kernel change have caused it?

I tried to play with Linux sysctl settings:

net.ipv4.tcp_congestion_control
net.ipv4.tcp_sack
net.ipv4.route.flush

but that did not help.

The problem seems to appear on the listening socket's side. In tcpdump the TCP window size looks fine during the handshake, but after the first incoming packet the listener's side reduces its window.

UPD
Here is my server-side code snippet:

 /* Create the listening TCP socket. */
 serv_fd = socket(AF_INET, SOCK_STREAM, 0);
 if (serv_fd == -1) {
      perror("socket");
      return;
 }

 /* Listen on every local address, port LISTEN_PORT. */
 server.sin_family = AF_INET;
 server.sin_port = htons(LISTEN_PORT);
 server.sin_addr.s_addr = htonl(INADDR_ANY);

 #ifdef SET_BUF
 /* Optionally set the buffer sizes on the listening socket itself. */
 if (setsockopt(serv_fd, SOL_SOCKET, SO_RCVBUF, &buflen, sizeof(buflen)) == -1) {
      perror("setsockopt");
      return;
 }
 if (setsockopt(serv_fd, SOL_SOCKET, SO_SNDBUF, &buflen, sizeof(buflen)) == -1) {
      perror("setsockopt");
      return;
 }
 #endif // SET_BUF

 if (bind(serv_fd, (struct sockaddr *) &server, sizeof(server)) == -1) {
      perror("bind");
      return;
 }

 /* Backlog of 3 pending connections. */
 if (listen(serv_fd, 3) == -1) {
      perror("listen");
      return;
 }

 printf("Server is listening on %u\n", LISTEN_PORT);

Could someone shed some light on this? I would be very grateful!
Can it be related to some recent Linux kernel changes? Do I need to tune some kernel settings or check some user-mode settings (e.g. socket options)?

P.S. The problem is intermittent.

UPD:

tcpdump's output:

IP 10.0.0.34.31334 > 10.0.0.99.12345: Flags [S], seq 426261790, win 43690, options [mss 65495,sackOK,TS val 799180610 ecr 0,nop,wscale 7], length 0
IP 10.0.0.99.12345 > 10.0.0.34.31334: Flags [S.], seq 803872704, ack 426261791, win 65483, options [mss 65495,sackOK,TS val 799180567 ecr 799180610,nop,wscale 0], length 0
IP 10.0.0.34.31334 > 10.0.0.99.12345: Flags [.], ack 1, win 342, options [nop,nop,TS val 799180610 ecr 799180567], length 0
IP 10.0.0.34.31334 > 10.0.0.99.12345: Flags [P.], seq 1:1301, ack 1, win 342, options [nop,nop,TS val 799180610 ecr 799180567], length 1300
IP 10.0.0.34.31334 > 10.0.0.99.12345: Flags [P.], seq 1301:1804, ack 1, win 342, options [nop,nop,TS val 799181412 ecr 799180610], length 503
IP 10.0.0.99.12345 > 10.0.0.34.31334: Flags [.], ack 1804, win 512, options [nop,nop,TS val 799181412 ecr 799181412], length 0

10.0.0.34.31334 is the client and 10.0.0.99.12345 is the server. Note the unexpected win 512 in the last line.

UPD2: I saw several messages about SYN cookies in dmesg, like:

possible SYN flooding on port 12345. Sending cookies.

But they do not correlate closely in time with the slow transmissions.

asked Jan 16 '20 17:01

z0lupka

1 Answer

I'm not sure this is exactly your case, but it looks similar, and it seems to be a known problem.

Reasons

Several circumstances together can lead to this Linux kernel behavior:

- The way the kernel handles a connection in SYN-cookie context when that connection has a zero Window Scale (or a Window Scale modified in some other way).
- The zero Window Scale itself, which you provoked by calling setsockopt() with SO_RCVBUF (see tcp_select_initial_window()).
- An extremely small backlog.

Explanation

About the "slow" transmission:
The Window Scale (WS) option is exchanged at the [SYN - SYN+ACK] stage by both hosts. Roughly speaking, Host A says "multiply the window sizes I advertise by 2^N for the rest of the connection" in its SYN, then Host B says the same with its own value M in the SYN+ACK; N and M may be equal. In a normal situation both hosts store these values and apply them throughout the data exchange.
The TCP SYN-cookie technique, however, makes the server forget the [SYN - SYN+ACK] stage of the connection, so some of the stated options, including WS, are lost after the SYN+ACK is sent. In that case the Linux kernel recalculates the WS value when the ACK arrives, because that ACK means a regular connection has to be created. The recalculated value can differ from the original one because setsockopt() does not affect it. So your server sends a zero Window Scale in the SYN+ACK, forgets it, and then re-creates the connection when the ACK arrives as if it had announced some default Window Scale (e.g. 7). It then advertises a small window value, expecting the client to multiply it by 128. The client has not forgotten that WS is 0, so it treats the small window as the real one and sends data in small portions. That is where your "slow" connection comes from.
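
To make the arithmetic concrete, here is a small sketch of how the same window field is read by the two sides. The value 512 is taken from the last line of your tcpdump output; the server-side scale of 7 is only the example default mentioned above, not something I measured on your host, and the function names are mine.

```c
#include <stdint.h>
#include <stdio.h>

/* Bytes a host believes the window covers: the 16-bit window field
 * from the TCP header, shifted left by the window scale that host
 * remembers for this direction. */
static uint32_t effective_window(uint16_t win_field, unsigned wscale)
{
    return (uint32_t)win_field << wscale;
}

/* Prints how the server and the client each interpret one window field.
 * server_ws and client_ws are the scales each side believes in
 * (assumed values, see the text above). */
static void print_mismatch(uint16_t win_field, unsigned server_ws,
                           unsigned client_ws)
{
    printf("server means %u bytes, client reads %u bytes\n",
           (unsigned)effective_window(win_field, server_ws),
           (unsigned)effective_window(win_field, client_ws));
}
```

Calling print_mismatch(512, 7, 0) shows the server meaning 65536 bytes while the client reads only 512, which is in line with the 200-500 byte segments you observe.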

About the SYN flood:
With such a small backlog, as few as 3 SYN retransmissions can fill the backlog queue and trigger SYN cookies. BTW, do you see retransmissions in tcpdump?
From ip-sysctl.txt:

Note, that syncookies is fallback facility.
It MUST NOT be used to help highly loaded servers to stand
against legal connection rate. If you see SYN flood warnings
in your logs, but investigation shows that they occur
because of overload with legal connections, you should tune
another parameters until this warning disappear.
See: tcp_max_syn_backlog, tcp_synack_retries, tcp_abort_on_overflow.

syncookies seriously violate TCP protocol, do not allow
to use TCP extensions, can result in serious degradation
of some services (f.e. SMTP relaying), visible not by you,
but your clients and relays, contacting you. While you see
SYN flood warnings in logs not being really flooded, your server
is seriously misconfigured.

So if there are no SYN-flood attacks in your LAN, your server is seriously misconfigured. SYN cookies should do their job only when a SYN-flood attack is actually in progress.
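
If you want to check the relevant settings from inside your service, something like the following could work. It is only a sketch: it assumes the usual Linux /proc/sys layout, and the helper names are made up for illustration. Keep in mind that the backlog passed to listen() is also silently capped by net.core.somaxconn.

```c
#include <stdio.h>

/* Reads one integer from a /proc/sys file.
 * Returns -1 if the file cannot be opened or parsed. */
static long read_sysctl_long(const char *path)
{
    FILE *f = fopen(path, "r");
    long value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Prints the settings that influence when SYN cookies are sent. */
static void print_syn_settings(void)
{
    printf("tcp_syncookies      = %ld\n",
           read_sysctl_long("/proc/sys/net/ipv4/tcp_syncookies"));
    printf("tcp_max_syn_backlog = %ld\n",
           read_sysctl_long("/proc/sys/net/ipv4/tcp_max_syn_backlog"));
    printf("somaxconn           = %ld\n",
           read_sysctl_long("/proc/sys/net/core/somaxconn"));
}
```

Calling print_syn_settings() once at startup, next to your "Server is listening" message, would put these values in the same log as the slow connections.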

Solution

In conclusion, a few steps can eliminate the problem:

If there is a real SYN flood in your network, SYN cookies partially solve that security issue. During a real attack there is no time to worry about slow connections; it is an emergency.
If not, i.e. some SYN retransmissions triggered SYN cookies:
- increase the backlog thoughtfully to eliminate such conditions;
- don't call setsockopt() with SO_RCVBUF on the listening socket. It doesn't make much sense there. Without it, the kernel is less likely to compute two different WS values in the scenario described above. Btw, you can set SO_RCVBUF on the accepted socket if needed.
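
A minimal sketch of both suggestions, under my own assumptions: the function names are hypothetical, and the backlog you pass (say 128 instead of 3) is something you have to pick for your load. Note that the window scale is fixed during the handshake, so setting the buffers on the accepted socket changes the buffer sizes only.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Creates a listening socket and leaves SO_RCVBUF/SO_SNDBUF untouched
 * on it. Returns the descriptor, or -1 on error. */
static int make_listener(uint16_t port, int backlog)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(fd, backlog) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Accepts one connection and only then applies the buffer sizes.
 * Returns the connected descriptor, or -1 on error. */
static int accept_with_buffers(int listen_fd, int buflen)
{
    int fd = accept(listen_fd, NULL, NULL);

    if (fd == -1)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &buflen, sizeof(buflen)) == -1 ||
        setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &buflen, sizeof(buflen)) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

In your server this would replace the SET_BUF block: call make_listener(LISTEN_PORT, 128) once, then accept_with_buffers(serv_fd, buflen) for each client.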

Repro

I reproduced your problem with a simple client and server, using hping3 under similar conditions. You can fill the server's backlog queue with:

hping3 -c 3 -S -p 12345 --fast 10.0.0.99

and then initiate a connection from the client. The connection will be opened in the so-called "SYN-cookie context", at least on the 4.4 kernel. You can also check it on the 3.10 kernel by increasing -c from 3 to some X until the problem reproduces.

answered Sep 30 '22 19:09

red0ct

Tags: c, linux-kernel, tcp, sockets, network-programming
