I was looking into how Java counts the set bits of an int.
I had in mind something simple like this (which I think is correct):
public static int bitCount(int number) {
    final int MASK = 0x1;
    int count = 0;
    for (int i = 0; i < 32; i++) {
        if (((number >>> i) & MASK) == MASK) {
            count++;
        }
    }
    return count;
}
Instead I found a method that I have absolutely no idea what it is doing (it seems like magic to me):
i = i - ((i >>> 1) & 0x55555555);
i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);
i = (i + (i >>> 4)) & 0x0f0f0f0f;
i = i + (i >>> 8);
i = i + (i >>> 16);
return i & 0x3f;
Could someone help me understand what this does, and why a simple function like my first attempt is/could be bad?
Sure
i = i - ((i >>> 1) & 0x55555555);
The bit pattern of 5 is 0101 (four bits), so the mask is a repetition of the bit pattern 01 sixteen times. This line counts the number of bits in every two-bit pair of the number. If you consider two-bit pairs, (i >>> 1) & 0x1 gets the high-order bit in the low-order position. Now, there are four possible two-bit patterns:
00 ~> 00 - 00 = 00
01 ~> 01 - 00 = 01
10 ~> 10 - 01 = 01
11 ~> 11 - 01 = 10
and each two-bit pair now contains the number of bits the original had in those positions.
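To see this step in isolation, here is a small sketch (the class name and sample value are mine) that applies just the first line to a four-bit value:

```java
public class PairStep {
    // One pair-counting step: afterwards each two-bit pair of i
    // holds how many bits were set in that pair of the original value.
    static int pairCounts(int i) {
        return i - ((i >>> 1) & 0x55555555);
    }

    public static void main(String[] args) {
        // 0b1101: the high pair 11 becomes 10 (two bits set),
        // the low pair 01 stays 01 (one bit set).
        System.out.println(Integer.toBinaryString(pairCounts(0b1101))); // prints 1001
    }
}
```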
i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);
Next, we count the bits that were in each four-bit group (aka nibble). By masking a nibble with 0x3 = 0011(b), we get the count of bits that were in the lower-order two bits of the nibble, and (i >>> 2) & 0x3 gets the count from the higher-order two bits of the nibble. These counts are now added. Since each count was at most 2, the sum is at most 4, hence doesn't leave the nibble.
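Continuing the example with a concrete nibble (again a sketch with names of my choosing):

```java
public class NibbleStep {
    static int step1(int i) { return i - ((i >>> 1) & 0x55555555); }
    static int step2(int i) { return (i & 0x33333333) + ((i >>> 2) & 0x33333333); }

    public static void main(String[] args) {
        // 0b1111 has four set bits: step1 yields the pair counts 10|10,
        // and step2 adds them into the nibble as 0100 = 4.
        System.out.println(Integer.toBinaryString(step2(step1(0b1111)))); // prints 100
    }
}
```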
i = (i + (i >>> 4)) & 0x0f0f0f0f;
Now the bits in each octet are counted. Each nibble contains the count of bits set in the original in that place. Shifting right by four bits moves the count from the higher-order nibble of each octet onto the lower-order nibble, and these are added. The shift also moves the count from the lower-order nibble of each octet into the higher-order nibble of the adjacent octet; that is masked out by the & 0x0f0f0f0f. Since each octet can have at most eight bits set, the count fits in the lower-order nibble of the octet.
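The first three steps together, applied to a full octet (a sketch; 0xff is an arbitrary all-ones example):

```java
public class OctetStep {
    static int step1(int i) { return i - ((i >>> 1) & 0x55555555); }
    static int step2(int i) { return (i & 0x33333333) + ((i >>> 2) & 0x33333333); }
    static int step3(int i) { return (i + (i >>> 4)) & 0x0f0f0f0f; }

    public static void main(String[] args) {
        // 0xff has eight set bits; after three steps the low octet holds 8.
        System.out.println(step3(step2(step1(0xff)))); // prints 8
    }
}
```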
i = i + (i >>> 8);
Now we add the counts of each pair of adjacent octets. (These unmasked additions also spill intermediate values into the higher-order octets, but those are dealt with at the end.)
i = i + (i >>> 16);
Now we add the totals of the counts in the high order two octets and the low order two.
return i & 0x3f;
Finally, everything except the low-order six bits is masked out (an int has at most 32 set bits, and 32 fits in six bits), since the higher-order octets still contain intermediate counts.
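Assembled into one method (this is the bit hack from the question; the class name is mine), it can be checked against the library's Integer.bitCount:

```java
public class PopCount {
    // The five-step bit hack, assembled into a single method.
    static int bitCount(int i) {
        i = i - ((i >>> 1) & 0x55555555);
        i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);
        i = (i + (i >>> 4)) & 0x0f0f0f0f;
        i = i + (i >>> 8);
        i = i + (i >>> 16);
        return i & 0x3f;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 0x0f0f0f0f, Integer.MIN_VALUE, 0xdeadbeef};
        for (int n : samples) {
            // Compare against the JDK's own implementation.
            System.out.println(bitCount(n) == Integer.bitCount(n)); // prints true six times
        }
    }
}
```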
The reason why your simple implementation might be considered bad is performance. The convoluted bit-hack uses fewer operations and no branches, so it is faster. It is, however, far easier to get wrong.
Another neat way to count the set bits in an int
(or long
) is
public static int bitCount(int n) {
    int count = 0;
    while (n != 0) {
        ++count;
        n = n & (n - 1);
    }
    return count;
}
That works because n = n & (n-1) clears the last set bit in n and leaves everything else untouched. If the bit pattern of n ends with ...100...0, the bit pattern of n-1 is ...011...1, and the bits before the last 1-bit in n are the same. In Java, that is guaranteed to work for negative numbers too, because integer overflow has wrap-around semantics: if n = Integer.MIN_VALUE, the bit pattern is 100...0, and n - 1 becomes Integer.MAX_VALUE with the bit pattern 011...1.
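That wrap-around edge case can be verified directly (a small sketch):

```java
public class ClearLowestBit {
    public static void main(String[] args) {
        int n = Integer.MIN_VALUE;  // bit pattern 100...0, a single set bit
        int m = n & (n - 1);        // n - 1 wraps around to Integer.MAX_VALUE, 011...1
        System.out.println(m);      // prints 0: the only set bit was cleared
    }
}
```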
The cool method only works because the computer has a finite range of values for int. It (and other such bit algorithms) won't work so easily over an unbounded range (like BigInteger), since you need to know all of the needed masks before the calculation.
In any case, you can read about how it works here: http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel
It's near the bottom of that chapter.