
Fastest way to calculate sum of bits in byte array

Tags: arrays, c#, byte, bit

I have two byte arrays of the same length. I need to XOR each pair of corresponding bytes and then calculate the sum of the set bits in the result.

For example:

11110000^01010101 = 10100101 -> so 1+1+1+1 = 4

I need to do the same operation for each element of the byte arrays.

Andrew Orsich asked Nov 18 '10 19:11




3 Answers

Use a lookup table. There are only 256 possible values after XORing, so building it takes next to no time. Unlike izb's solution, though, I wouldn't suggest typing all the values in by hand: compute the lookup table once at startup using one of the looping answers.

For example:

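using System.Linq;

public static class ByteArrayHelpers
{
    // Number of set bits for every possible byte value (0..255), computed once.
    private static readonly int[] LookupTable =
        Enumerable.Range(0, 256).Select(CountBits).ToArray();

    // Counts the set bits in the low 8 bits of value.
    private static int CountBits(int value)
    {
        int count = 0;
        for (int i = 0; i < 8; i++)
        {
            count += (value >> i) & 1;
        }
        return count;
    }

    // XORs every byte of the array together, then looks up the bit count
    // of the single resulting byte.
    public static int CountBitsAfterXor(byte[] array)
    {
        int xor = 0;
        foreach (byte b in array)
        {
            xor ^= b;
        }
        return LookupTable[xor];
    }
}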

(You could make it an extension method if you really wanted...)

Note the use of byte[] in the CountBitsAfterXor method - you could make it an IEnumerable<byte> for more generality, but iterating over an array (which is known to be an array at compile-time) will be faster. Probably only microscopically faster, but hey, you asked for the fastest way :)

I would almost certainly actually express it as

public static int CountBitsAfterXor(IEnumerable<byte> data)

in real life, but see which works better for you.

Also note the type of the xor variable as an int. In fact, there's no XOR operator defined for byte values, and if you made xor a byte it would still compile due to the nature of compound assignment operators, but it would be performing a cast on each iteration - at least in the IL. It's quite possible that the JIT would take care of this, but there's no need to even ask it to :)
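The method above folds a single array into one byte. For the two-array case described in the question, the same lookup-table idea might be applied per pair of bytes, roughly as in the sketch below. The class name XorBitCounter and the two-parameter signature are assumptions made for illustration, not part of the original answer.

```csharp
using System;
using System.Linq;

// Hypothetical helper; the name and signature are illustrative only.
public static class XorBitCounter
{
    // Number of set bits for every possible byte value, built once.
    private static readonly int[] LookupTable =
        Enumerable.Range(0, 256).Select(CountBits).ToArray();

    private static int CountBits(int value)
    {
        int count = 0;
        for (int i = 0; i < 8; i++)
        {
            count += (value >> i) & 1;
        }
        return count;
    }

    // XORs corresponding bytes of the two arrays and sums the set bits.
    public static int CountBitsAfterXor(byte[] left, byte[] right)
    {
        if (left.Length != right.Length)
        {
            throw new ArgumentException("Arrays must have the same length.");
        }

        int total = 0;
        for (int i = 0; i < left.Length; i++)
        {
            // byte ^ byte is evaluated as an int in the range 0..255,
            // so the result can index the table directly.
            total += LookupTable[left[i] ^ right[i]];
        }
        return total;
    }
}
```

With the question's example, XorBitCounter.CountBitsAfterXor(new byte[] { 0xF0 }, new byte[] { 0x55 }) gives 4.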

Jon Skeet answered Oct 18 '22 12:10


Fastest way would probably be a 256-element lookup table...

int[] lut =
{
    /*0x00*/ 0,
    /*0x01*/ 1,
    /*0x02*/ 1,
    /*0x03*/ 2,
    // ...
    /*0xFE*/ 7,
    /*0xFF*/ 8
};

e.g.

11110000^01010101 = 10100101 -> lut[165] == 4
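If typing out 256 entries is unappealing, the table could also be generated. One possible sketch, assuming a hypothetical BitCountTable helper (not from the answer), uses the fact that the bit count of i equals the bit count of i >> 1 plus the lowest bit of i:

```csharp
// Hypothetical helper; the name is illustrative only.
public static class BitCountTable
{
    public static int[] Build()
    {
        var lut = new int[256];

        // lut[0] is already 0. Every later entry reuses the count for the
        // value shifted right by one and adds the lowest bit.
        for (int i = 1; i < 256; i++)
        {
            lut[i] = lut[i >> 1] + (i & 1);
        }
        return lut;
    }
}
```

The resulting array matches the hand-written entries shown above, for instance Build()[165] is 4 and Build()[0xFF] is 8.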
izb answered Oct 18 '22 11:10


This is more commonly referred to as bit counting. There are literally dozens of different algorithms for doing this. Here is one site which lists a few of the better-known methods. There are even CPU-specific instructions for doing this.

Theoretically, Microsoft could add a BitArray.CountSetBits function that gets JITed with the best algorithm for that CPU architecture. I, for one, would welcome such an addition.
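Later .NET versions (.NET Core 3.0 and up, which postdate this question) do ship a bit-counting primitive, System.Numerics.BitOperations.PopCount, and the runtime may map it to a hardware instruction where one is available. A minimal sketch of using it for the question's two-array case follows; the class name PopCountExample and the eight-bytes-at-a-time loop are assumptions for illustration, and whether this beats a lookup table would need measuring.

```csharp
using System;
using System.Numerics;

// Hypothetical helper; assumes .NET Core 3.0 or later for BitOperations.
public static class PopCountExample
{
    public static int CountBitsAfterXor(byte[] left, byte[] right)
    {
        if (left.Length != right.Length)
        {
            throw new ArgumentException("Arrays must have the same length.");
        }

        int total = 0;
        int i = 0;

        // Process eight bytes per step. Byte order does not matter here,
        // because only the number of set bits is needed.
        for (; i + 8 <= left.Length; i += 8)
        {
            ulong x = BitConverter.ToUInt64(left, i) ^ BitConverter.ToUInt64(right, i);
            total += BitOperations.PopCount(x);
        }

        // Handle the remaining zero to seven bytes one at a time.
        for (; i < left.Length; i++)
        {
            total += BitOperations.PopCount((uint)(left[i] ^ right[i]));
        }

        return total;
    }
}
```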

Brian Gideon answered Oct 18 '22 11:10