I have been experimenting with implementing some protocol decoders, but each time I run into a "simple" problem where I feel the way I am solving it is not optimal and there must be a better way. I'm using C. Currently I'm using some canned data and reading it in as a file, but later on it will come in via TCP or UDP.
Here's the problem. I'm currently playing with a binary protocol at work. All fields are 8 bits long. The first field (8 bits) is the packet type, so I read that first byte and, using a switch/case, call a function to read in the rest of the packet, since at that point I know its size and structure. BUT... some of these packets have nested packets inside them, so when I encounter one of those I have to read another 8-16 bytes, hit another switch/case to see what the next packet type is, and so on. (Luckily the packets are only nested 2 or 3 deep.) Only once I have the whole packet decoded can I hand it over to my state machine for processing.
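To make that concrete, here is roughly the shape of my current code. The packet-type names, field sizes, and the read_exact() helper are made up for illustration:

#include <stdint.h>
#include <stdio.h>

/* Hypothetical packet types, just to show the structure. */
enum { PKT_STATUS = 0x01, PKT_CONTAINER = 0x02 };

/* Read exactly n bytes from the canned-data file (later: the socket). */
static int read_exact(FILE *f, uint8_t *dst, size_t n)
{
    return fread(dst, 1, n, f) == n ? 0 : -1;
}

static int decode_packet(FILE *f)
{
    uint8_t type, buf[64];

    if (read_exact(f, &type, 1) != 0)      /* first 8-bit field: packet type */
        return -1;

    switch (type) {
    case PKT_STATUS:                       /* fixed size: read the remainder */
        return read_exact(f, buf, 4);
    case PKT_CONTAINER:                    /* carries a nested packet... */
        if (read_exact(f, buf, 8) != 0)    /* ...after its own 8-byte header */
            return -1;
        return decode_packet(f);           /* another type byte, another switch */
    default:
        return -1;
    }
}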
I guess this can be a more general question as well: how much data should you read at a time from the socket? As much as possible? Only as much as the part that is common to all the protocol headers?
So even though this protocol is fairly basic, my code is a whole bunch of switch/case statements, and I do a lot of small reads from the file/socket, which I feel is not optimal. My main aim is to make this decoder as fast as possible. To the more experienced people out there: is this the way to go, or is there a better way that I just haven't figured out yet? Any elegant solution to this problem?
I recommend this approach:
Pseudo C code (imagine that destinationBuffer is a circular buffer - I believe such a data structure is vital for applications that need to parse a lot of incoming data):
forever()
{
    // this function appends newly read data to the buffer
    read_all_you_can(destinationBuffer);
    ...
    handle_data(destinationBuffer);
    // the buffer is then adjusted to reflect how much
    // of the data was actually processed
}
Generally it is better to read as much as possible at a time: fewer, larger reads give better performance.
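For concreteness, here is a minimal runnable sketch of that loop, assuming a POSIX file or socket descriptor. A flat buffer compacted with memmove() stands in for a true circular buffer, and handle_data() is a hypothetical function that consumes as many complete packets as it can and returns how many bytes it used:

#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define BUF_CAP 4096

/* Hypothetical: decode whole packets from buf[0..len), return bytes consumed. */
extern size_t handle_data(const uint8_t *buf, size_t len);

static void decode_loop(int fd)
{
    uint8_t buf[BUF_CAP];
    size_t len = 0;                          /* bytes currently buffered */

    for (;;) {
        /* Read as much as will fit: one syscall per pass, not per field. */
        ssize_t n = read(fd, buf + len, BUF_CAP - len);
        if (n <= 0)
            break;                           /* EOF or error */
        len += (size_t)n;

        /* Process complete packets; a trailing partial packet stays buffered. */
        size_t used = handle_data(buf, len);

        /* Adjust the buffer: shift the unprocessed tail to the front.
         * A real circular buffer would avoid this copy. */
        memmove(buf, buf + used, len - used);
        len -= used;
    }
}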
Resist the temptation to optimize prematurely. First make it work; only then should you think about whether it needs optimization. If you do, do so scientifically: benchmark your code and go for the lowest-hanging fruit first; don't rely on gut feel.
Don't forget that your OS will probably be buffering the data itself, whether you are reading from a file or a socket. Still, repeated syscalls are likely to be a bottleneck, so eliminating them may well be a straightforward optimization win. At a former workplace we avoided this issue by having our packet header explicitly encode its length (never more than 8k): that way we knew exactly how much to bulk-read into an array, and then our own buffering code took over.
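As a sketch of that length-prefix idea: assuming a hypothetical header with a 16-bit big-endian length field and a blocking POSIX descriptor, each packet body can be fetched with a single bulk read:

#include <stdint.h>
#include <unistd.h>

/* Read exactly n bytes, looping over short reads. Returns 0 on success. */
static int read_full(int fd, uint8_t *dst, size_t n)
{
    while (n > 0) {
        ssize_t r = read(fd, dst, n);
        if (r <= 0)
            return -1;                       /* EOF or error */
        dst += r;
        n   -= (size_t)r;
    }
    return 0;
}

/* One bulk read per packet body: the header says exactly how much to fetch. */
static int read_packet(int fd, uint8_t *body, size_t cap)
{
    uint8_t hdr[2];
    if (read_full(fd, hdr, 2) != 0)
        return -1;
    size_t len = ((size_t)hdr[0] << 8) | hdr[1];   /* big-endian length */
    if (len > cap)
        return -1;                           /* caller's buffer is too small */
    if (read_full(fd, body, len) != 0)
        return -1;
    return (int)len;                         /* fits in int: packets capped at 8k */
}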