Given a byte array uint8_t data[N], what is an efficient method to find a byte uint8_t search within it even if search is not octet aligned? I.e. the first three bits of search could be in data[i] and the next 5 bits in data[i+1].
My current method involves creating a bool get_bit(const uint8_t* src, struct internal_state* state) function (struct internal_state contains a mask that is bit-shifted right, ANDed with src and returned, maintaining size_t src_index < size_t src_len), left-shifting the returned bits into a uint8_t my_register and comparing it with search every time, and using state->src_index and state->src_mask to get the position of the matched byte.
Is there a better method for this?
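For reference, the bit-by-bit approach described above can be sketched roughly as follows. This is a simplified stand-in, not the code from the question: the name find_byte_bitwise is hypothetical, a plain bit counter replaces struct internal_state, and bits are assumed to be read most significant bit first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the bit offset (counted MSB-first from
 * data[0]) of the first occurrence of `search`, or -1 if it is absent. */
long find_byte_bitwise(const uint8_t *data, size_t len, uint8_t search)
{
    uint8_t reg = 0;              /* plays the role of my_register */
    size_t total_bits = len * 8;

    for (size_t bit = 0; bit < total_bits; bit++) {
        /* Extract one bit, MSB of each byte first. */
        uint8_t b = (uint8_t)((data[bit / 8] >> (7 - bit % 8)) & 1u);

        /* Shift it into the register; the oldest bit falls off the top. */
        reg = (uint8_t)((reg << 1) | b);

        /* The register only holds 8 real bits once 8 bits were read. */
        if (bit >= 7 && reg == search)
            return (long)(bit - 7);
    }
    return -1;
}
```

This does one shift and one compare per input bit, which is the cost the question is trying to reduce.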
If you're searching for an 8-bit pattern within a large array, you can slide a window over 16-bit values and check whether the pattern occurs anywhere within the two bytes that form each 16-bit value.
To be portable you have to take care of endianness, which my implementation does by building the 16-bit value manually: the high byte is always the currently iterated byte and the low byte is the byte that follows it. If you do a simple conversion like value = *((unsigned short *)pData) you will run into trouble on x86 processors, because they are little-endian and the two bytes end up swapped.
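The manual construction can be isolated in a small helper. This is only a sketch; the name load_be16 is hypothetical and does not appear in the implementation further down, which does the same thing inline.

```c
#include <stdint.h>

/* Hypothetical helper: combine two consecutive bytes so that p[0] always
 * lands in the high byte, whatever the byte order of the host CPU. */
static uint16_t load_be16(const uint8_t *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}
```

For the bytes {0x12, 0x34} this yields 0x1234 on any host, whereas the pointer cast would read 0x3412 on a little-endian machine. The cast can also fail on platforms that require aligned 16-bit loads.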
Once value, cmp and mask are set up, value is tested against cmp under mask, and then cmp and mask are both shifted right by one bit for the next test. If the pattern is not found starting within the high byte, the loop continues with the next byte as the start byte.
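To make the shifting concrete, the test of a single 16-bit window can be written on its own. The function name match_in_window is hypothetical; it mirrors the inner loop only and ignores how the window is advanced through the array.

```c
#include <stdint.h>

/* Hypothetical helper: returns the bit offset (0..7) at which `pattern`
 * starts inside the 16-bit window `value`, or -1 if it does not start
 * within the high byte. Offset 8 is left to the next window. */
static int match_in_window(uint16_t value, uint8_t pattern)
{
    uint16_t cmp  = (uint16_t)((unsigned)pattern << 8); /* pattern in high byte */
    uint16_t mask = 0xFF00u;                            /* selects the high byte */

    for (int shift = 0; shift < 8; shift++) {
        if ((value & mask) == cmp)
            return shift;
        mask >>= 1;   /* slide the 8-bit window one bit to the right */
        cmp  >>= 1;   /* slide the pattern along with it */
    }
    return -1;
}
```

With value = 0x1234 and pattern = 0x23, the fifth test uses mask = 0x0FF0 and cmp = 0x0230, and 0x1234 & 0x0FF0 is 0x0230, so the match is reported at bit offset 4.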
Here is my implementation including some debug printouts (the function returns the bit position or -1 if pattern was not found):
#include <stdio.h>

int findPattern(unsigned char *data, int size, unsigned char pattern)
{
    int result = -1;
    unsigned char *pData;
    unsigned char *pEnd;
    unsigned short value;
    unsigned short mask;
    unsigned short cmp;
    int tmpResult;

    if ((data != NULL) && (size > 0))
    {
        pData = data;
        pEnd = data + size;
        while ((pData < pEnd) && (result == -1))
        {
            if ((pData + 1) < pEnd) /* still at least two bytes to check? */
            {
                /* print both bytes only here, where pData[1] is in bounds */
                printf("\n\npData = {%02x, %02x, ...};\n", pData[0], pData[1]);
                tmpResult = (int)(pData - data) * 8; /* calculate bit offset according to current byte */
                /* avoid endianness troubles by "manually" building value! */
                value = *pData << 8;
                pData++;
                value += *pData;
                /* create a sliding window to check if search pattern is within value */
                cmp = pattern << 8;
                mask = 0xFF00;
                while (mask > 0x00FF) /* the low byte is checked within next iteration! */
                {
                    printf("cmp = %04x, mask = %04x, tmpResult = %d\n", cmp, mask, tmpResult);
                    if ((value & mask) == cmp)
                    {
                        result = tmpResult;
                        break;
                    }
                    tmpResult++; /* count bits! */
                    mask >>= 1;
                    cmp >>= 1;
                }
            }
            else
            {
                printf("\n\npData = {%02x};\n", pData[0]);
                /* only one chance left if there is only one byte left to check! */
                if (*pData == pattern)
                {
                    result = (int)(pData - data) * 8;
                }
                pData++;
            }
        }
    }
    return (result);
}
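Without the debug printouts, the same idea fits in a few lines. The following is a sketch of a compact variant, not a drop-in replacement: the name find_pattern_compact and the size_t/long interface are assumptions, and it extracts each candidate byte with a shift instead of keeping a separate mask and cmp.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compact variant: returns the bit offset (MSB-first from
 * data[0]) of the first match, or -1 if the pattern is not found. */
long find_pattern_compact(const uint8_t *data, size_t len, uint8_t pattern)
{
    if (data == NULL)
        return -1;

    for (size_t i = 0; i < len; i++) {
        int has_next = (i + 1 < len);

        /* Current byte in the high half, next byte (if any) in the low half. */
        uint16_t window = (uint16_t)((unsigned)data[i] << 8);
        if (has_next)
            window |= data[i + 1];

        /* Offsets 1..7 borrow bits from the next byte, so the last byte
         * of the array can only match at offset 0. */
        int max_shift = has_next ? 7 : 0;
        for (int s = 0; s <= max_shift; s++) {
            if ((uint8_t)(window >> (8 - s)) == pattern)
                return (long)(i * 8 + (size_t)s);
        }
    }
    return -1;
}
```

For data = {0x12, 0x34} and pattern = 0x23 this returns 4, since the bits 0010 0011 start four bits into the first byte.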