Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

tcp connect fails randomly under high load

Our application uses a non-blocking socket usage with connect and select operations (c code). The pusedo code is as below:

unsigned int ConnectToServer(struct sockaddr_in *pSelfAddr,struct sockaddr_in *pDestAddr)
    {
         int sktConnect = -1;
         sktConnect = socket(AF_INET,SOCK_STREAM,0);
         if(sktConnect == INVALID_SOCKET)
             return -1;
         fcntl(sktConnect,F_SETFL,fcntl(sktConnect,F_GETFL) | O_NONBLOCK);
         if(pSelfAddr != 0)
         {
             if(bind(sktConnect,(const struct sockaddr*)(void *)pSelfAddr,sizeof(*pSelfAddr)) != 0)
             {
                 closesocket(sktConnect);
                 return -1;
             }
         }
         errno = 0;
         int nRc = connect(sktConnect,(const struct sockaddr*)(void *)pDestAddr, sizeof(*pDestAddr));
         if(nrC != -1)
         {
             return sktConnect;
         }
         if(errno != EINPROGRESS)
         {
             int savedError = errno;
             closesocket(sktConnect);
             return -1;
         }
         fd_set scanSet;
         FD_ZERO(&scanSet);
         FD_SET(sktConnect,&scanSet);
         struct timeval waitTime;
         waitTime.tv_sec = 2;
         waitTime.tv_usec = 0;
         int tmp;
         tmp = select(sktConnect +1, (fd_set*)0, &scanSet, (fd_set*)0,&waitTime);
         if(tmp == -1 || !FD_ISSET(sktConnect,&scanSet))
         {
             int savedErrorNo = errno;
             writeLog("Connect %s failed after select, cause %d, error %s",inet_ntoa(pDestAddr->sin_addr),savedErrorNo,strerror(savedErrorNo));
             closesocket(sktConnect);
             return -1;
         } 
    .        .        .     .     .}

There are 80 such nodes and the application connects to all its peer in round-robin fashion. In this phase, some of the nodes are not able to connect (api – connect + select) with error number 115.

In the below logs (of tcpdump output) for success scenario, we can see (SYN, SYN+ACK, ACK) but no entry of even SYN is present for failed node in tcpdump logs.

The tcpdump logs are:

387937 2012-07-05 07:45:30.646514 10.18.92.173          10.137.165.136        TCP      33728 > 8441 [SYN] Seq=0 Ack=0 Win=5792 Len=0 MSS=1460 TSV=1414450402 TSER=912308224 WS=8 
387947 2012-07-05 07:45:30.780762 10.137.165.136        10.18.92.173          TCP      8441 > 33728 [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=912309754 TSER=1414450402 WS=8 
387948 2012-07-05 07:45:30.780773 10.18.92.173          10.137.165.136        TCP      33728 > 8441 [ACK] Seq=1 Ack=1 Win=5888 Len=0 TSV=1414450435 TSER=912309754 
All the above three events indicate the success information.
387949 2012-07-05 07:45:30.782652 10.18.92.173          10.137.165.136        TCP      33728 > 8441 [PSH, ACK] Seq=1 Ack=1 Win=5888 Len=320 TSV=1414450436 TSER=912309754 
387967 2012-07-05 07:45:30.915615 10.137.165.136        10.18.92.173          TCP      8441 > 33728 [ACK] Seq=1 Ack=321 Win=6912 Len=0 TSV=912309788 TSER=1414450436 
388011 2012-07-05 07:45:31.362712 10.18.92.173          10.137.165.136        TCP      33728 > 8441 [PSH, ACK] Seq=321 Ack=1 Win=5888 Len=320 TSV=1414450581 TSER=912309788 
388055 2012-07-05 07:45:31.495558 10.137.165.136        10.18.92.173          TCP      8441 > 33728 [ACK] Seq=1 Ack=641 Win=7936 Len=0 TSV=912309933 TSER=1414450581 
388080 2012-07-05 07:45:31.702336 10.137.165.136        10.18.92.173          TCP      8441 > 33728 [PSH, ACK] Seq=1 Ack=641 Win=7936 Len=712 TSV=912309985 TSER=1414450581 
388081 2012-07-05 07:45:31.702350 10.18.92.173          10.137.165.136        TCP      33728 > 8441 [ACK] Seq=641 Ack=713 Win=7424 Len=0 TSV=1414450666 TSER=912309985 
388142 2012-07-05 07:45:32.185612 10.137.165.136        10.18.92.173          TCP      8441 > 33728 [PSH, ACK] Seq=713 Ack=641 Win=7936 Len=320 TSV=912310106 TSER=1414450666 
388143 2012-07-05 07:45:32.185629 10.18.92.173          10.137.165.136        TCP      33728 > 8441 [ACK] Seq=641 Ack=1033 Win=8704 Len=0 TSV=1414450786 TSER=912310106 
388169 2012-07-05 07:45:32.362622 10.18.92.173          10.137.165.136        TCP      33728 > 8441 [PSH, ACK] Seq=641 Ack=1033 Win=8704 Len=320 TSV=1414450831 TSER=912310106 
388212 2012-07-05 07:45:32.494833 10.137.165.136        10.18.92.173          TCP      8441 > 33728 [ACK] Seq=1033 Ack=961 Win=9216 Len=0 TSV=912310183 TSER=1414450831 
388219 2012-07-05 07:45:32.501613 10.137.165.136        10.18.92.173          TCP      8441 > 33728 [PSH, ACK] Seq=1033 Ack=961 Win=9216 Len=356 TSV=912310185 TSER=1414450831 
388220 2012-07-05 07:45:32.501624 10.18.92.173          10.137.165.136        TCP      33728 > 8441 [ACK] Seq=961 Ack=1389 Win=10240 Len=0 TSV=1414450865 TSER=912310185 

Application Logs informing error on connect (i.e api - connect + select)

[5258: 2012-07-05 07:45:30]Connect [10.137.165.136 <- 10.18.92.173] success. 
[5258: 2012-07-05 07:45:32]Connect 10.137.165.137 fail after select, cause:115, error Operation now in progress. Check whether remote machine exist and the network is normal or not. 
[5258: 2012-07-05 07:45:32]Connect to server([10.137.165.137 <- 10.18.92.173], port=8441) Failed!

Success logs corresponding to first 3 entries of tcpdump. And failure log where there are no events in the tcpdump

My question is : When client initiates “connect” api for failed case, i am not able to see any event in the tcpdump at client side (even initial SYN). What can be the reason of this randomness.

like image 626
Baplix Avatar asked Nov 03 '22 20:11

Baplix


1 Answers

You have hit EINPROGRESS. From the connect man page:

The socket is nonblocking and the connection cannot be completed immediately. It is possible to select(2) or poll(2) for completion by selecting the socket for writing. After select(2) indicates writability, use getsockopt(2) to read the SO_ERROR option at level SOL_SOCKET to determine whether connect() completed successfully (SO_ERROR is zero) or unsuccessfully (SO_ERROR is one of the usual error codes listed here, explaining the reason for the failure).

This is saying that EINPROGRESS is an indicator that the kernel is not able to complete the connection now, even though there are available local ports and routing cache entries. It seems this occurs when the socket state has not transitioned to "ESTABLISHED" yet. Just wait on the socket in select again, but call getsockopt afterwards to see if your connect had completed.

As to why, the socket transitions to SYN_SENT state during connect, but the packet may still be in the output queue and has not actually made it to the network device buffer yet.

like image 108
jxh Avatar answered Nov 07 '22 23:11

jxh