
What could be the cause of very slow socket reads?

I am using blocking TCP sockets for my client and server. Before every read, I use select to check whether data is available on the stream. I always read and write 40 bytes at a time. Most reads take a few milliseconds or less, but some take more than half a second, even though select has already reported that data is available on the socket.

I am also using TCP_NODELAY.

What could be causing it?

EDIT 2

I analyzed the timestamps of each packet sent and received and saw that this delay happens only when the client tries to read an object before the server has written the next one. For instance, the server wrote object number x, and the client then tried to read object x before the server was able to begin writing object number x+1. This makes me suspect that some kind of coalescing is taking place on the server side.

EDIT

The server is listening on 3 different ports. The client connects one by one to each of these ports.

There are three connections: one that frequently sends data from the server to the client, a second that only sends data from the client to the server, and a third that is used very rarely to send a single byte of data. I am facing the problem with the first connection. I use select() to check that data is available on that connection, and when I timestamp the 40-byte read, I find that it took about half a second.

Any pointers on how to profile this would be very helpful.

I am using gcc on Linux.

rdrr_server_start(void) {
    int rr_sd;
    int input_sd;
    int ack_sd;
    int fp_sd;

    startTcpServer(&rr_sd, remote_rr_port);
    startTcpServer(&input_sd, remote_input_port);
    startTcpServer(&ack_sd, remote_ack_port);
    startTcpServer(&fp_sd, remote_fp_port);

    connFD_rr = getTcpConnection(rr_sd);
    connFD_input = getTcpConnection(input_sd);
    connFD_ack = getTcpConnection(ack_sd);
    connFD_fp = getTcpConnection(fp_sd);
}

static int getTcpConnection(int sd) {
    socklen_t len;
    struct sockaddr_in clientAddress;
    len = sizeof(clientAddress);
    int connFD = accept(sd, (struct sockaddr*) &clientAddress, &len);
    nodelay(connFD);
    fflush(stdout);
    return connFD;
}

static void startTcpServer(int *sd, const int port) {
    *sd = socket(AF_INET, SOCK_STREAM, 0);
    ASSERT(*sd>0);

    // Set socket option so that port can be reused
    int enable = 1;
    setsockopt(*sd, SOL_SOCKET, SO_REUSEADDR, &enable, sizeof(int));

    struct sockaddr_in a;
    memset(&a,0,sizeof(a));
    a.sin_family = AF_INET;
    a.sin_port = port;
    a.sin_addr.s_addr = INADDR_ANY;
    int bindResult = bind(*sd, (struct sockaddr *) &a, sizeof(a));
    ASSERT(bindResult ==0);
    listen(*sd,2);
}

static void nodelay(int fd) {
    int flag=1;
    ASSERT(setsockopt(fd, SOL_TCP, TCP_NODELAY, &flag, sizeof flag)==0);
}

startTcpClient() {
    connFD_rr = socket(AF_INET, SOCK_STREAM, 0);
    connFD_input = socket(AF_INET, SOCK_STREAM, 0);
    connFD_ack = socket(AF_INET, SOCK_STREAM, 0);
    connFD_fp = socket(AF_INET, SOCK_STREAM, 0);

    struct sockaddr_in a;
    memset(&a,0,sizeof(a));
    a.sin_family = AF_INET;
    a.sin_port = remote_rr_port;
    a.sin_addr.s_addr = inet_addr(remote_server_ip);

    int CONNECT_TO_SERVER = connect(connFD_rr, &a, sizeof(a));
    ASSERT(CONNECT_TO_SERVER==0);

    a.sin_port = remote_input_port;
    CONNECT_TO_SERVER = connect(connFD_input, &a, sizeof(a));
    ASSERT(CONNECT_TO_SERVER==0);

    a.sin_port = remote_ack_port;
    CONNECT_TO_SERVER = connect(connFD_ack, &a, sizeof(a));
    ASSERT(CONNECT_TO_SERVER==0);

    a.sin_port = remote_fp_port;
    CONNECT_TO_SERVER = connect(connFD_fp, &a, sizeof(a));
    ASSERT(CONNECT_TO_SERVER==0);

    nodelay(connFD_rr);
    nodelay(connFD_input);
    nodelay(connFD_ack);
    nodelay(connFD_fp);
}

AnkurVj asked Nov 13 '22 04:11


1 Answer

I would be suspicious of this line of code:

ASSERT(setsockopt(fd, SOL_TCP, TCP_NODELAY, &flag, sizeof flag)==0);  

If you are running a release build, then ASSERT is most likely defined to expand to nothing, so the setsockopt call would never actually be made and TCP_NODELAY would not be set. The setsockopt call should not be inside the ASSERT statement. Instead, store its return value in a variable and verify that variable in the assert. Asserts with side effects are generally a bad idea, so even if this is not the cause of the problem, it should probably be changed.
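As a minimal sketch of that change, the helper below makes the setsockopt call unconditionally and checks only the stored result. The names set_nodelay, nodelay_enabled, and nodelay_or_abort are hypothetical and not from the question's code; the standard assert stands in for the question's ASSERT macro, whose definition is not shown; and IPPROTO_TCP is used in place of SOL_TCP as the more portable spelling of the same level.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical replacement for the question's nodelay(): the call always
 * runs, whether or not assertions are compiled in.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
static int set_nodelay(int fd)
{
    int flag = 1;
    int rc = setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof flag);
    if (rc != 0) {
        perror("setsockopt(TCP_NODELAY)");
    }
    return rc;
}

/* Reads the option back so the setting can be confirmed at run time.
 * Returns 1 if TCP_NODELAY is set, 0 if it is clear, -1 on error. */
static int nodelay_enabled(int fd)
{
    int flag = 0;
    socklen_t len = sizeof flag;
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, &len) != 0) {
        return -1;
    }
    return flag != 0;
}

/* The assert only inspects a variable, so compiling it out (for example
 * with NDEBUG defined) removes the check but not the setsockopt call. */
static void nodelay_or_abort(int fd)
{
    int rc = set_nodelay(fd);
    assert(rc == 0);
    (void)rc; /* silences the unused-variable warning when assert is disabled */
}
```

Calling nodelay_enabled on each connected socket in both the debug and the release build would show whether the option is really in effect in the build that exhibits the slow reads.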

Mark Wilkins answered Dec 05 '22 09:12