Fastest way to scan for bit pattern in a stream of bits

Question

Using simple brute force is sometimes good.

I think precalc all shifted values of the word and put them in 16 ints so you got an array like this (assuming int is twice as wide as short)

 unsigned short pattern = 1234;
 unsigned int preShifts[16];
 unsigned int masks[16];
 int i;
 for(i=0; i<16; i++)
 {
      preShifts[i] = (unsigned int)(pattern<<i);  //gets promoted to int
      masks[i] = (unsigned int) (0xffff<<i);
 }

and then for every unsigned short you get out of the stream, make an int of that short and the previous short and compare that unsigned int to the 16 unsigned int's. If any of them match, you got one.

So basically like this:

  int numMatch(unsigned short curWord, unsigned short prevWord)
  {
       int numHits = 0;
       int combinedWords = (prevWord<<16) + curWord;

       int i=0;
       for(i=0; i<16; i++)
       {
             if((combinedWords & masks[i]) == preShifsts[i]) numHits++;
       }
       return numHits;
  }

Do note that this could potentially mean multiple hits when the patterns is detected more than once on the same bits:

e.g. 32 bits of 0's and the pattern you want to detect is 16 0's, then it would mean the pattern is detected 16 times!

The time cost of this, assuming it compiles approximately as written, is 16 checks per input word. Per input bit, this does one & and ==, and branch or other conditional increment. And also a table lookup for the mask for every bit.

The table lookup is unnecessary; by instead right-shifting combined we get significantly more efficient asm, as shown in another answer which also shows how to vectorize this with SIMD on x86.

Here is a trick to speed up the search by a factor of 32, if neither the Knuth-Morris-Pratt algorithm on the alphabet of two characters {0, 1} nor reinier's idea are fast enough.

You can first use a table with 256 entries to check for each byte in your bit stream if it is contained in the 16-bit word you are looking for. The table you get with

unsigned char table[256];
for (int i=0; i<256; i++)
  table[i] = 0; // initialize with false
for (i=0; i<8; i++)
  table[(word >> i) & 0xff] = 1; // mark contained bytes with true

You can then find possible positions for matches in the bit stream using

for (i=0; i<length; i++) {
  if (table[bitstream[i]]) {
    // here comes the code which checks if there is really a match
  }
}

As at most 8 of the 256 table entries are not zero, in average you have to take a closer look only at every 32th position. Only for this byte (combined with the bytes one before and one after) you have then to use bit operations or some masking techniques as suggested by reinier to see if there is a match.

The code assumes that you use little endian byte order. The order of the bits in a byte can also be an issue (known to everyone who already implemented a CRC32 checksum).

I would like to suggest a solution using 3 lookup tables of size 256. This would be efficient for large bit streams. This solution takes 3 bytes in a sample for comparison. Following figure shows all possible arrangements of a 16 bit data in 3 bytes. Each byte region has shown in different color.

alt text http://img70.imageshack.us/img70/8711/80541519.jpg

Here checking for 1 to 8 will be taken care in first sample and 9 to 16 in next sample and so on. Now when we are searching for a Pattern, we will find all the 8 possible arrangements (as below) of this Pattern and will store in 3 lookup tables (Left, Middle and Right).

Initializing Lookup Tables:

Lets take an example 0111011101110111 as a Pattern to find. Now consider 4th arrangement. Left part would be XXX01110. Fill all raws of Left lookup table pointing by left part (XXX01110) with 00010000. 1 indicates starting position of arrangement of input Pattern. Thus following 8 raws of Left look up table would be filled by 16 (00010000).

Middle part of arrangement would be 11101110. Raw pointing by this index (238) in Middle look up table will be filled by 16 (00010000).

Now Right part of arrangement would be 111XXXXX. All raws (32 raws) with index 111XXXXX will be filled by 16 (00010000).

We should not overwrite elements in look up table while filling. Instead do a bitwise OR operation to update an already filled raw. In above example, all raws written by 3rd arrangement would be updated by 7th arrangement as follows.

alt text

Thus raws with index XX011101 in Left lookup table and 11101110 in Middle lookup table and 111XXXXX in Right lookup table will be updated to 00100010 by 7th arrangement.

Searching Pattern:

Take a sample of three bytes. Find Count as follows where Left is left lookup table, Middle is middle lookup table and Right is right lookup table.

Count = Left[Byte0] & Middle[Byte1] & Right[Byte2];

Number of 1 in Count gives the number of matching Pattern in taken sample.

I can give some sample code which is tested.

Initializing lookup table:

    for( RightShift = 0; RightShift < 8; RightShift++ )
    {
        LeftShift = 8 - RightShift;

        Starting = 128 >> RightShift;

        Byte = MSB >> RightShift;

        Count = 0xFF >> LeftShift;

        for( i = 0; i <= Count; i++ )
        {
            Index = ( i << LeftShift ) | Byte;

            Left[Index] |= Starting;
        }

        Byte = LSB << LeftShift;

        Count = 0xFF >> RightShift;

        for( i = 0; i <= Count; i++ )
        {
            Index = i | Byte;

            Right[Index] |= Starting;
        }

        Index = ( unsigned char )(( Pattern >> RightShift ) & 0xFF );

        Middle[Index] |= Starting;
    }

Searching Pattern:

Data is stream buffer, Left is left lookup table, Middle is middle lookup table and Right is right lookup table.

for( int Index = 1; Index < ( StreamLength - 1); Index++ )
{
    Count = Left[Data[Index - 1]] & Middle[Data[Index]] & Right[Data[Index + 1]];

    if( Count )
    {
        TotalCount += GetNumberOfOnes( Count );
    }
}

Limitation:

Above loop cannot detect a Pattern if it is placed at the very end of stream buffer. Following code need to add after loop to overcome this limitation.

Count = Left[Data[StreamLength - 2]] & Middle[Data[StreamLength - 1]] & 128;

if( Count )
{
    TotalCount += GetNumberOfOnes( Count );
}

Advantage:

This algorithm takes only N-1 logical steps to find a Pattern in an array of N bytes. Only overhead is to fill the lookup tables initially which is constant in all the cases. So this will be very effective for searching huge byte streams.

My money's on Knuth-Morris-Pratt with an alphabet of two characters.

Fastest way to scan for bit pattern in a stream of bits

Tags:

c++

c

algorithm

assembly

embedded

Recent Activity

Donate For Us

Fastest way to scan for bit pattern in a stream of bits

Tags:

c++

c

algorithm

assembly

embedded

Related questions

Recent Activity

Donate For Us