Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does this bit manipulation work in Java?

I was looking into how Java counts bits set of an int.
I had in my mind something simple like this (which I think is correct):

public static int bitCount(int number){  
        final int MASK = 0x1;  
        int count = 0;  

        for(int i = 0; i < 32; i++){  
            if(((number >>> i) & MASK) == MASK){  
                count++;  
            }  
        }  
        return count;  
    }  

Instead I found a method that I absolutely have no idea what is doing (seems like magic to me):

 i = i - ((i >>> 1) & 0x55555555);  
 i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);  
 i = (i + (i >>> 4)) & 0x0f0f0f0f;  
 i = i + (i >>> 8);  
 i = i + (i >>> 16);  
 return i & 0x3f;  

Could someone help understand what this does and why a simple function like the one I came up as first thought is/could be bad?

like image 611
Cratylus Avatar asked Jun 03 '12 21:06

Cratylus


People also ask

What is bit manipulation in Java?

Bit manipulation is the direct manipulation of data bits to perform operations and is an important optimization skill now tested by FAANG recruiters.

How bitwise and works in Java?

Bitwise AND (&) It is a binary operator denoted by the symbol &. It returns 1 if and only if both bits are 1, else returns 0.

Does Java have bit manipulation?

Java supports 3-bit shift and 4 bitwise operators to perform operations at the bit level. These operators can be used on integral types (int, short, long and byte) to perform operations at the bit level.

What is bit manipulation used for?

Bit manipulation is the act of algorithmically manipulating bits or other pieces of data shorter than a word. Computer programming tasks that require bit manipulation include low-level device control, error detection and correction algorithms, data compression, encryption algorithms, and optimization.


2 Answers

Sure

i = i - ((i >>> 1) & 0x55555555);

The bit pattern of 5 is 0101 (four bits), so the mask is a repetition of the bit pattern 01 sixteen times. This line counts the number of bits in every two-bit pair of the number. If you consider two-bit pairs, (i >>> 1) & 0x1 gets the high-order bit in the low-order position. Now, there are four possible two-bit patterns

00 ~> 00 - 00 = 00
01 ~> 01 - 00 = 01
10 ~> 10 - 01 = 01
11 ~> 11 - 01 = 10

and each two-bit pair now contains the number of bits the original had in those positions.

i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);

Next, we count the bits that were in each four-bit group (aka nibble). By masking a nibble with 0x3 = 0011(b), we get the count of bits that were in the lower order two bits of the nibble, and (i >>> 2) & 0x3 gets the count from the higher order two bits of the nibble. These counts are now added. Since each count was at most 2, the sum is at most 4, hence doesn't leave the nibble.

i = (i + (i >>> 4)) & 0x0f0f0f0f;

Now the bits in each octet are counted. Each nibble contains the count of bits set in the original in that place. Shifting right by four bits moves the count from the higher order nibble in each place to the lower order nibble, these are added. Then we also moved the count from the lower order nibble to the higher order nible of the adjacent octet, that is masked out by the & 0x0f0f0f0f. Since each octet can have at most eight bits set, the count doesn't leave the lower order nibble of the octet.

i = i + (i >>> 8);

Now we add the counts of the pairs of adjacent octets.

i = i + (i >>> 16);

Now we add the totals of the counts in the high order two octets and the low order two.

return i & 0x3f;

Finally, all octets except the lowest order octet are masked out, since the higher order octets still contained intermediate counts.

The reason why your simple implementation might be considered bad is performance. The convoluted bit-hack uses fewer operations and no branches, so it is faster. It is, however, far easier to get wrong.

Another neat way to count the set bits in an int (or long) is

public static int bitCount(int n) {
    int count = 0;
    while(n != 0) {
        ++count;
        n = n & (n-1);
    }
    return count;
}

That works because n = n & (n-1) clears the last set bit in n and leaves everything else untouched. If the bit pattern of n ends with

...100...0

the bit pattern of n-1 is

...011...1

and the bits before the last 1-bit in n are the same. In Java, that is guaranteed to work also for negative numbers because integer overflow has wrap-around semantics, so if n = Integer.MIN_VALUE, the bit pattern is 100...0 and n - 1 becomes Integer.MAX_VALUE with the bit pattern 011...1.

like image 92
Daniel Fischer Avatar answered Nov 01 '22 15:11

Daniel Fischer


the cool method only works since the computer has a finite range of values for int. it won't work (and so other cool bits algorithms) so easily for infinite range (like BigInteger) since you need to know all of the needed masks before the calculation .

in any case , you can read how it works via : http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel

it's at the bottom of this chapter.

like image 34
android developer Avatar answered Nov 01 '22 16:11

android developer