
Handling multiple recv() calls and all possible scenarios

I am fairly new to C and am writing a TCP server, and was wondering how to handle recv()s from a client that sends commands the server responds to. For the sake of this question, let's just say the header is the 1st byte, the command identifier is the 2nd byte, and the payload length is the 3rd byte, followed by the payload (if any).

What is the best way to recv() this data? I was thinking of calling recv() to read the first 3 bytes into the buffer, checking that the header and command identifier are valid, then checking the payload length and calling recv() again with the payload length as the length, appending the result to the aforementioned buffer. Reading Beej's networking article (particularly the section Son of Data Encapsulation), however, he advises using "an array big enough for two [max length] packets" to handle situations such as getting some of the next packet.

What is the best way to handle these types of recv()s? Basic question, but I would like to implement it efficiently, handling all cases that can arise. Thanks in advance.

Jack asked Dec 02 '10 15:12


2 Answers

Nice question. How perfect do you want to go? For an all-singing, all-dancing solution, use asynchronous (non-blocking) sockets, read all the data you can whenever you can, and whenever you get new data call some data processing function on the buffer.

This allows you to do big reads. If you get a lot of commands pipelined, you can potentially process them without having to wait on the socket again, which improves throughput and response time.
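As a rough sketch of "read all the data you can", assuming a POSIX system: the helper names set_nonblocking() and read_available() are made up for this illustration, and the caller is assumed to own the buffer and its length counter.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Put a descriptor into non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Keep calling recv() until the buffer is full or the socket has nothing
   more to give right now. New bytes are appended at buf + *len.
   Returns 1 if the connection is still open, 0 if the peer closed it,
   and -1 on a real error. */
int read_available(int fd, unsigned char *buf, size_t cap, size_t *len)
{
    assert(*len <= cap);

    while (*len < cap) {
        ssize_t n = recv(fd, buf + *len, cap - *len, 0);

        if (n > 0) {
            *len += (size_t)n;
            continue;
        }
        if (n == 0)
            return 0;               /* orderly shutdown by the peer */
        if (errno == EINTR)
            continue;               /* interrupted by a signal, retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return 1;               /* drained for now, wait for readability */
        return -1;
    }
    return 1;                       /* buffer full, process before reading more */
}
```

You would call read_available() whenever select() or poll() reports the socket readable, then run your processing function over whatever has accumulated.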

Do something similar on the write side. That is, the command processing function writes to a buffer rather than directly to the socket. If there is data in the buffer, then when checking sockets (select or poll) also check for writeability and write as much as you can, remembering to remove only the bytes actually written from the buffer.
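One possible shape for that write side, as a sketch only: struct out_buffer, OUT_BUF_SIZE and the out_*() functions are names invented here, and a flat array with memmove() is used for simplicity.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

#define OUT_BUF_SIZE 4096           /* hypothetical capacity */

struct out_buffer {
    unsigned char data[OUT_BUF_SIZE];
    size_t len;                     /* bytes queued but not yet sent */
};

/* Queue bytes for sending. Returns 0 on success, -1 if they do not fit. */
int out_queue(struct out_buffer *out, const void *src, size_t n)
{
    if (n > OUT_BUF_SIZE - out->len)
        return -1;
    memcpy(out->data + out->len, src, n);
    out->len += n;
    return 0;
}

/* Drop the first n bytes once they have actually been written. */
void out_consume(struct out_buffer *out, size_t n)
{
    assert(n <= out->len);
    out->len -= n;
    if (out->len > 0)
        memmove(out->data, out->data + n, out->len);
}

/* Call when select()/poll() reports the socket writable.
   Returns whatever send() returned; a short count is normal and simply
   leaves the unsent tail in the buffer for next time. */
ssize_t out_flush(int sock, struct out_buffer *out)
{
    ssize_t sent;

    if (out->len == 0)
        return 0;
    sent = send(sock, out->data, out->len, 0);
    if (sent > 0)
        out_consume(out, (size_t)sent);
    return sent;
}
```

Only ask select() or poll() about writeability while out.len is non-zero, otherwise the loop will spin on a socket that is almost always writable.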

Circular buffers work well in such situations.
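For reference, a minimal byte-at-a-time circular buffer could look like the sketch below. struct ring, RING_SIZE, ring_put() and ring_get() are illustrative names, and the capacity is deliberately tiny so that wrap-around is easy to see; a real one would be larger and would probably copy in two memcpy() chunks.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8                 /* hypothetical, tiny on purpose */

struct ring {
    unsigned char data[RING_SIZE];
    size_t head;                    /* index of the oldest stored byte */
    size_t count;                   /* number of bytes currently stored */
};

/* Append up to n bytes. Returns how many were actually stored. */
size_t ring_put(struct ring *r, const unsigned char *src, size_t n)
{
    size_t i, space;

    assert(r->count <= RING_SIZE);
    space = RING_SIZE - r->count;
    if (n > space)
        n = space;
    for (i = 0; i < n; i++)
        r->data[(r->head + r->count + i) % RING_SIZE] = src[i];
    r->count += n;
    return n;
}

/* Remove up to n bytes, oldest first. Returns how many were copied out. */
size_t ring_get(struct ring *r, unsigned char *dst, size_t n)
{
    size_t i;

    if (n > r->count)
        n = r->count;
    for (i = 0; i < n; i++)
        dst[i] = r->data[(r->head + i) % RING_SIZE];
    r->head = (r->head + n) % RING_SIZE;
    r->count -= n;
    return n;
}
```

The attraction over a flat array is that consuming data only moves the head index, so nothing has to be shuffled back to the start.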

There are lighter, simpler solutions. However, this one is a good one. Remember that a server might get multiple connections and packets can be split. If you read from a socket into a buffer only to find you don't have the data for a complete command, what do you do with the data you have already read? Where do you store it? If you store it in a buffer associated with that connection, then you might as well go the whole hog and just read into the buffer as described above in the first place.

This solution also avoids having to spawn a separate thread for each connection - you can handle any number of connections without any real problems. Spawning a thread per connection is an unnecessary waste of system resources - except in certain circumstances where multiple threads would be recommended anyway, and for that you can simply have worker threads to execute such blocking tasks while keeping the socket handling single threaded.

Basically I agree with what you say Beej says, but don't read tiddly bits at a time. Read big chunks at a time. Writing a socket server like this, learning and designing as I went along based on a tiny bit of socket experience and man pages, was one of the most fun projects I have ever worked on, and very educational.

AlastairG answered Oct 05 '22 06:10


The method that Beej is alluding to, and AlastairG mentions, works something like this:

For each concurrent connection, you maintain a buffer of read-but-not-yet-processed data (this is the buffer that Beej suggests sizing to twice the maximum packet length). Obviously, the buffer starts off empty:

unsigned char recv_buffer[BUF_SIZE];
size_t recv_len = 0;

Whenever your socket is readable, read into the remaining space in the buffer, then immediately try and process what you have:

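ssize_t result = recv(sock, recv_buffer + recv_len, BUF_SIZE - recv_len, 0);

if (result > 0) {
    recv_len += (size_t)result;
    process_buffer(recv_buffer, &recv_len);
} else if (result == 0) {
    /* Peer closed the connection - clean up this client */
} else {
    /* Error - check errno (EINTR and EAGAIN/EWOULDBLOCK are not fatal) */
}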

The process_buffer() function will try to process the data in the buffer as a packet. If the buffer doesn't contain a full packet yet, it just returns - otherwise, it processes the data and removes it from the buffer. So for your example protocol, it would look something like:

void process_buffer(unsigned char *buffer, size_t *len)
{
    while (*len >= 3) {
        /* We have at least 3 bytes, so we have the payload length */

        unsigned payload_len = buffer[2];

        if (*len < 3 + payload_len) {
            /* Too short - haven't received whole payload yet */
            break;
        }

        /* OK - execute command */
        do_command(buffer[0], buffer[1], payload_len, &buffer[3]);

        /* Now shuffle the remaining data in the buffer back to the start */
        *len -= 3 + payload_len;
        if (*len > 0)
            memmove(buffer, buffer + 3 + payload_len, *len);
    }
}

(The do_command() function would check for a valid header and command byte).

This kind of technique ends up being necessary, because any recv() can return a short length - with your proposed method, what happens if your payload length is 500, but the next recv() only returns you 400 bytes? You'll have to save those 400 bytes until the next time the socket becomes readable anyway.
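To see the short-read case in action without a real socket, here is a small sketch under a few assumptions: feed() is a made-up stand-in for a recv() that returned a fragment, do_command() is a stub that only counts calls, and process_buffer() is a variation on the one above that walks an offset and does a single memmove() at the end.

```c
#include <assert.h>
#include <string.h>

#define BUF_SIZE 516                /* assumption: 2 * (3 + 255), two max packets */

static unsigned commands_seen;      /* hypothetical counters for the stub */
static unsigned last_payload_len;

/* Stub: a real do_command() would validate header and cmd, then act. */
static void do_command(unsigned char header, unsigned char cmd,
                       unsigned payload_len, const unsigned char *payload)
{
    (void)header; (void)cmd; (void)payload;
    commands_seen++;
    last_payload_len = payload_len;
}

/* Same logic as above, but consumed bytes are removed once per call. */
static void process_buffer(unsigned char *buffer, size_t *len)
{
    size_t off = 0;

    while (*len - off >= 3) {
        unsigned payload_len = buffer[off + 2];

        if (*len - off < 3 + payload_len)
            break;                  /* whole payload not here yet */
        do_command(buffer[off], buffer[off + 1], payload_len, &buffer[off + 3]);
        off += 3 + payload_len;
    }
    if (off > 0) {
        *len -= off;
        if (*len > 0)
            memmove(buffer, buffer + off, *len);
    }
}

/* Stand-in for recv(): append a fragment as if it had just arrived. */
static void feed(unsigned char *buffer, size_t *len,
                 const unsigned char *frag, size_t n)
{
    assert(n <= BUF_SIZE - *len);
    memcpy(buffer + *len, frag, n);
    *len += n;
    process_buffer(buffer, len);
}
```

Feeding a 7-byte packet as fragments of 2, 3 and 2 bytes should run the command only after the last fragment, with the earlier bytes simply waiting in the buffer in between.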

When you handle multiple concurrent clients, you simply have one recv_buffer and recv_len per client, and stuff them into a per-client struct (which likely contains other things too - like the client's socket, perhaps their source address, current state etc.).
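A per-client struct along those lines might look like the following sketch. The field names, the fixed-size clients table, MAX_CLIENTS and the client_*() helpers are all assumptions for illustration; MAX_PACKET follows from the example protocol (3 header bytes plus a payload of up to 255).

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

#define MAX_PACKET  (3 + 255)           /* 3 header bytes + largest payload */
#define BUF_SIZE    (2 * MAX_PACKET)    /* room for two max-length packets */
#define MAX_CLIENTS 64                  /* hypothetical limit */

enum client_state { CLIENT_UNUSED = 0, CLIENT_ACTIVE };

struct client {
    enum client_state state;
    int sock;
    struct sockaddr_storage addr;       /* peer address, if known */
    socklen_t addr_len;
    unsigned char recv_buffer[BUF_SIZE];
    size_t recv_len;                    /* read-but-not-yet-processed bytes */
};

static struct client clients[MAX_CLIENTS];

/* Claim a free slot for a newly accepted socket. Returns NULL if full. */
struct client *client_add(int sock, const struct sockaddr *addr, socklen_t addr_len)
{
    size_t i;

    assert(addr_len <= sizeof(struct sockaddr_storage));
    for (i = 0; i < MAX_CLIENTS; i++) {
        struct client *c = &clients[i];

        if (c->state != CLIENT_UNUSED)
            continue;
        memset(c, 0, sizeof *c);        /* also resets recv_len to 0 */
        c->state = CLIENT_ACTIVE;
        c->sock = sock;
        if (addr != NULL && addr_len > 0) {
            memcpy(&c->addr, addr, addr_len);
            c->addr_len = addr_len;
        }
        return c;
    }
    return NULL;
}

/* Look up the active client that owns a socket, or NULL if there is none. */
struct client *client_find(int sock)
{
    size_t i;

    for (i = 0; i < MAX_CLIENTS; i++)
        if (clients[i].state == CLIENT_ACTIVE && clients[i].sock == sock)
            return &clients[i];
    return NULL;
}

/* Mark a slot as free again (the caller closes the socket). */
void client_release(struct client *c)
{
    c->state = CLIENT_UNUSED;
}
```

When a socket becomes readable you would look up its struct client and pass c->recv_buffer and &c->recv_len to recv() and process_buffer(), so each connection's partial packet stays with that connection.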

caf answered Oct 05 '22 07:10