
Matching binary patterns in C

I'm currently developing a C program that needs to parse some bespoke data structures. Fortunately I know how they are structured, but I'm not sure how to implement my parser in C.

Each structure is 32 bits long, and each can be identified by its binary signature.

As an example, there are two particular structures that I'm interested in, and they have the following binary patterns (x means either 0 or 1):

 0000-00xx-xxxx-xxx0
 0000-10xx-10xx-xxx0

Within these structures the 'x' bits contain the actual data I need, so essentially I need a way of identifying each structure based on how the bits are written within each structure.

So as an example in pseudo-code:

if (binaryPattern == 000010xxxxxxxxx0) {
    do something with it;
}

I'm guessing that reading them as ints and then performing some kind of bitmasking would be the way to go, but my knowledge of C isn't great. Maybe a simple logical OR operation would do it, but I just wanted some advice before I start.

Thanks

Thanks very much to everyone that has answered, very helpful!!

Tony asked Jan 17 '13 13:01


2 Answers

To check if your data matches a specific binary pattern, you can first mask out the non-signature bits, then compare it against a signature template.

For example, to check if your data matches the 0000 10xx 10xx xxx0 signature:

  1. AND your input data with 1111 1100 1100 0001 (the mask)
  2. check if the output equals 0000 1000 1000 0000 (the template)

To illustrate with some sample data:

DATA_1   0010 1011 1101 1100                DATA_2   0000 1011 1010 1100
  MASK   1111 1100 1100 0001  &               MASK   1111 1100 1100 0001  &
        --------------------                        --------------------
         0010 1000 1100 0000 (NO_MATCH)              0000 1000 1000 0000 (MATCH)
        --------------------                        --------------------

Each of your rules can therefore be represented by a mask-template pair, and all you need is a function that applies the above operation to your data to check for a match.
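Such a function could look roughly like the sketch below. It assumes 16-bit values, as in the patterns shown; the name `matches_signature` and its parameter names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when the fixed (non-x) bits of
 * `value` equal `tmpl`. `mask` has a 1 for every fixed bit. */
bool matches_signature(uint16_t value, uint16_t mask, uint16_t tmpl)
{
    /* A template bit that lies outside the mask could never match,
     * so treat that as a programming error. */
    assert((tmpl & (uint16_t)~mask) == 0);

    /* Step 1: AND with the mask. Step 2: compare with the template. */
    return (value & mask) == tmpl;
}
```

With the sample data above written in hex, `matches_signature(0x0BAC, 0xFCC1, 0x0880)` is true (DATA_2) and `matches_signature(0x2BDC, 0xFCC1, 0x0880)` is false (DATA_1).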

Shawn Chin answered Sep 22 '22 18:09


BTW you only showed 16-bit patterns, not 32-bit...

Anyway, you can just define masks that represent which part of the pattern is of interest to you. Then perform a bitwise AND of your value and the mask; if the result equals the test pattern, you have found what you want.

// 0000-00xx-xxxx-xxx0
#define PATTERN1 0x0000
#define MASK1 0xfc01

// 0000-10xx-10xx-xxx0
#define PATTERN2 0x0880
#define MASK2 0xfcc1

if ((value & MASK1) == PATTERN1) {
  // have pattern 1
}
else if ((value & MASK2) == PATTERN2) {
  // have pattern 2
}

If you have more patterns, it's obviously best to put the patterns and masks in a table and loop over it.
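A table-driven version might look something like this. It is only a sketch, again assuming 16-bit values; `struct signature`, `find_signature` and `data_bits` are hypothetical names, and the `data_bits` helper (clearing the fixed bits to leave only the 'x' bits) is an extra not covered above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per structure type: mask selects the fixed bits,
 * pattern gives the values those bits must have. */
struct signature {
    uint16_t pattern;
    uint16_t mask;
};

/* The two signatures from the question. */
static const struct signature signatures[] = {
    { 0x0000, 0xfc01 },  /* 0000-00xx-xxxx-xxx0 */
    { 0x0880, 0xfcc1 },  /* 0000-10xx-10xx-xxx0 */
};

#define SIGNATURE_COUNT (sizeof signatures / sizeof signatures[0])

/* Returns the index of the first matching entry, or -1 if none match. */
int find_signature(uint16_t value)
{
    for (size_t i = 0; i < SIGNATURE_COUNT; i++) {
        if ((value & signatures[i].mask) == signatures[i].pattern)
            return (int)i;
    }
    return -1;
}

/* Keeps only the 'x' (data) bits of a value that matched entry `index`.
 * The bits stay in their original positions; they are not shifted. */
uint16_t data_bits(uint16_t value, int index)
{
    assert(index >= 0 && (size_t)index < SIGNATURE_COUNT);
    return value & (uint16_t)~signatures[index].mask;
}
```

Because the two signatures differ in a fixed bit, no value can match both, so the order of the table does not matter here. If patterns ever overlap, the first entry in the table wins.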

Kalle Pokki answered Sep 19 '22 18:09