Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

48-bit string of eight 6-bit units: how to get middle 4 bits of each unit quickly

I have nearly implemented DES algorithm with C language, and I want to optimize my code. So I used gprof. Here is part of the report:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  us/call  us/call  name    
 51.78      9.32     9.32  8000000     1.17     1.17  sboxes
 34.71     15.57     6.25  8000000     0.78     0.78  extendRight
  9.90     17.35     1.78   500000     3.56    35.96  operation
  2.39     17.78     0.43  8000000     0.05     0.05  xorRightAndKey

gprof shows that sboxes function occupied 51.78% of the time.

In sboxes(uchar aucData[6], ...), I was given 48 bits, split them into 8 slot, each slot of 6 bits.

for each slot:

  1. combine first bit with last bit to get X;

  2. obtain middle 4 bit to get Y;

  3. do something with (X, Y);

For example, 011110 is a slot, so X = 00 and Y = 1111.

To implement this, I wrote MACRO to GET/SET bit in memory, here is relative code:

#define LOCATE(ptr, index) (((char *)(ptr))[(index) >> 3])

#define GET_BIT(ptr, index) (LOCATE((ptr), (index)) & (((uchar)0x80) >> ((index) % 8)))

And here is the code to get (X, Y)

uchar basePos = 0x00;
for (int i = 0; i < 8; ++i) {
    x = 0;
    y = 0;
    basePos = i * 6; // to locate the slot
    // combine first bit with last bit 
    if (0 != GET_BIT(aucData, basePos)) {
        x |= 0x02;
    }   
    if (0 != GET_BIT(aucData, basePos + 5)) {
        x |= 0x01;
    }   
    // get continuous 4 bits
    for (int j = 1; j <= 4; ++j) {
        if (0 != GET_BIT(aucData, basePos + j)) {
            y |= (0x01 << (4 - j));
        }   
    }   
    // do something with (x, y)
}

So my question is, I was given 48 bits, how to get the middle 4 bits as fast as possible?

like image 807
Anthony Cooper Avatar asked Sep 28 '22 10:09

Anthony Cooper


1 Answers

Without lookup table:

typedef unsigned long long u64;

void sboxes(uchar aucData[6])
{
    u64 v = aucData[0] + (((u64)aucData[1]) << 8)
    + (((u64)aucData[2]) << 16)
    + (((u64)aucData[3]) << 24)
    + (((u64)aucData[4]) << 32)
    + (((u64)aucData[5]) << 40);

    for(int i = 0; i < 8; i++) 
    {
        uchar x = ((v & 1) << 1) | ((v >> 5) & 1);
        uchar y = ((v >> 1) & 0xF);
        // do something with x, y
        printf("x: %hhu, y: %hhu\n", x, y);

        v >>= 6;
    }
}

Full disclaimer: I didn't benchmark. But it should be fast. You may be able to do the packing into u64 faster, if it's still too slow.

like image 147
tohoho Avatar answered Oct 16 '22 23:10

tohoho