
Buffering data from sockets?

Tags: c, sockets, buffer

I am trying to make a simple HTTP server that would be able to parse client requests and send responses back.

Now I have a problem. I have to read and handle one line at a time in the request, and I don't know if I should:

  • read one byte at a time, or
  • read chunks of N bytes at a time, put them in a buffer, and then handle the bytes one by one, before reading a new chunk of bytes.

What would be the best option, and why?

Also, are there some alternative solutions to this? Like a function that would read a line at a time from the socket or something?

Frxstrem asked Dec 23 '22 00:12


2 Answers

Reading a single byte at a time is going to kill performance, since every byte costs a separate read call. Consider a circular buffer of decent size.

Read chunks of whatever size is free in the buffer. Most of the time you will get short reads. Check the bytes you have read for the end of the HTTP command. Process each complete command; the byte after it becomes the head of the buffer. If the buffer becomes full, copy it off to a backup buffer, use a second circular buffer, report an error, or do whatever else is appropriate.
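As a rough illustration of the chunked approach, here is a minimal sketch. Note that it swaps the circular buffer for a simpler linear buffer that slides leftover bytes to the front with memmove. The names line_reader, next_line and BUF_CAP are made up for this example, and it treats a full buffer with no newline as an error.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BUF_CAP 4096  /* assumed buffer size, not taken from the answer */

/* Hypothetical helper type: bytes [0, len) of data are read but not yet consumed. */
struct line_reader {
    int fd;
    size_t len;
    char data[BUF_CAP];
};

/* Copies the next line (without its "\r\n" or "\n") into out.
 * Returns the line length, or -1 on error, EOF, or a line that
 * does not fit in the buffer or in out. */
ssize_t next_line(struct line_reader *r, char *out, size_t out_cap)
{
    for (;;) {
        char *nl = memchr(r->data, '\n', r->len);
        if (nl != NULL) {
            size_t used = (size_t)(nl - r->data) + 1;  /* consumed bytes, including '\n' */
            size_t n = used - 1;
            if (n > 0 && r->data[n - 1] == '\r')
                n--;                                   /* drop the '\r' of a CRLF pair */
            if (n + 1 > out_cap)
                return -1;
            memcpy(out, r->data, n);
            out[n] = '\0';
            /* Slide the unconsumed tail to the front of the buffer. */
            memmove(r->data, r->data + used, r->len - used);
            r->len -= used;
            return (ssize_t)n;
        }
        if (r->len == BUF_CAP)
            return -1;                                 /* buffer full, no newline seen */
        /* Ask for as much as is free; short reads are normal. */
        ssize_t got = read(r->fd, r->data + r->len, BUF_CAP - r->len);
        if (got <= 0)
            return -1;                                 /* error or connection closed */
        r->len += (size_t)got;
    }
}
```

A caller would set fd to the connected socket and len to 0, then call next_line in a loop until it sees the empty line that ends the request headers. A real circular buffer avoids the memmove, at the cost of handling lines that wrap around the end.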

Duck answered Jan 06 '23 04:01


The short answer to your question is that I would go with reading a single byte at a time. Unfortunately, it's one of those cases where both approaches have pros and cons.

In favour of a buffer, the implementation can be more efficient from the perspective of network I/O. Against it, I think the code will be inherently more complex than the single-byte version. So it's an efficiency vs. complexity trade-off. The good news is that you can implement the simple solution first, profile the result, and "upgrade" to a buffered approach if testing shows it to be worthwhile.
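For reference, the simple version could look something like the sketch below. The function name read_line_bytewise is made up for this example, it assumes a blocking socket, and it does not retry reads interrupted by a signal.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads one line from fd into buf (capacity cap), one byte per read() call.
 * Returns the line length without the trailing "\r\n" or "\n",
 * or -1 on error, on EOF before any byte arrives, or if the line
 * does not fit in buf. */
ssize_t read_line_bytewise(int fd, char *buf, size_t cap)
{
    size_t len = 0;

    while (len + 1 < cap) {
        char c;
        ssize_t n = read(fd, &c, 1);
        if (n < 0)
            return -1;                  /* read error */
        if (n == 0) {                   /* peer closed the connection */
            if (len == 0)
                return -1;
            buf[len] = '\0';
            return (ssize_t)len;        /* partial last line */
        }
        if (c == '\n') {
            if (len > 0 && buf[len - 1] == '\r')
                len--;                  /* drop the '\r' of a CRLF pair */
            buf[len] = '\0';
            return (ssize_t)len;
        }
        buf[len++] = c;
    }
    return -1;                          /* line does not fit in buf */
}
```

Each call costs one system call per byte, which is exactly the overhead a buffered version avoids, but there is no leftover data to carry between calls.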

Also, just to note, as a thought experiment I wrote some pseudocode for a loop that does buffer-based reads of HTTP requests, included below. The complexity of implementing a buffered read doesn't seem too bad. Note, however, that I haven't given much consideration to error handling, or tested whether this will work at all. It should, though, avoid excessive "double handling" of data, which is important since that would reduce the efficiency gains that were the purpose of this approach.

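#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK_SIZE 1024

void processHttp(char *http);  // supplied elsewhere; gets one NUL-terminated request

void readHttpRequests(int httpSocket)
{
  size_t nextHttpBytesRead = 0;  // bytes already read that belong to the next request
  char *nextHttp = NULL;         // buffer holding those bytes

  while (1)
  {
    size_t httpBytesRead = nextHttpBytesRead;
    size_t thisHttpSize;
    char *http = nextHttp;
    char *temp;
    char *httpTerminator = NULL;
    ssize_t n;

    // The leftover bytes may already contain a complete request
    if (http != NULL)
      httpTerminator = strstr(http, "\r\n\r\n");

    while (NULL == httpTerminator)
    {
      // +1 leaves room for the NUL that strstr needs
      temp = realloc(http, httpBytesRead + CHUNK_SIZE + 1);
      if (NULL == temp) { free(http); return; }
      http = temp;

      n = read(httpSocket, http + httpBytesRead, CHUNK_SIZE);
      if (n <= 0) { free(http); return; }  // error or connection closed
      httpBytesRead += (size_t)n;
      http[httpBytesRead] = '\0';
      httpTerminator = strstr(http, "\r\n\r\n");
    }

    thisHttpSize = (size_t)(httpTerminator - http) + 4;  // Include terminator
    nextHttpBytesRead = httpBytesRead - thisHttpSize;

    // Adding CHUNK_SIZE here means that the first realloc won't have to do any work
    nextHttp = malloc(nextHttpBytesRead + CHUNK_SIZE + 1);
    if (NULL == nextHttp) { free(http); return; }
    memcpy(nextHttp, http + thisHttpSize, nextHttpBytesRead);
    nextHttp[nextHttpBytesRead] = '\0';

    http[thisHttpSize] = '\0';
    processHttp(http);
    free(http);
  }
}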
torak answered Jan 06 '23 05:01