Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to find number of 1's in a binary number in O(1) time?

I know this has been asked before, but I'm looking at this particular solution listed here:

int BitCount(unsigned int u)
{
     unsigned int uCount;

     uCount = u - ((u >> 1) & 033333333333) - ((u >> 2) & 011111111111);
     return ((uCount + (uCount >> 3)) & 030707070707) % 63;
}

How does it work?

Are there any caveats involved here?

Theoretically is it possible to find the answer in constant time? I mean don't we actually have to iterate through the bits to count?

like image 387
batman Avatar asked Dec 01 '22 02:12

batman


1 Answers

Counting bits

An unsigned 32-bit integer u can be written like this:

u = a31 * 231 + a30 * 230 + ... + a0 * 20

We want the value of a31 + a30 + ... + a0.

Let's compare the values of u >> k:

u >> 0  = a31 * 231 + a30 * 230 + ... + a1 * 21 + a0 * 20
u >> 1  = a31 * 230 + a30 * 229 + ... + a1 * 20
u >> 2  = a31 * 229 + a30 * 228 + ...
...
u >> 29 = a31 * 22  + a29 * 21  + ...
u >> 30 = a31 * 21  + a30 * 20 
u >> 31 = a31 * 20 

We will calculate the bit population by this formula:

u >> 0 - u >> 1 - u >> 2 - ... - u >> 31 = p

Let's see why this works:

  u >> 0 - u >> 1 - u >> 2 - ... - u >> 31
= u >> 0 - (u >> 1 + u >> 2 + ... + u >> 31)
= u - q

What's the value of q? Let's calculate it bit by bit, looking at the values for u >> k above. For a31, it's:

  a31 * 230 + a31 * 229 + ...
= a31 * (230 + 229 + ...)
= a31 * (231 - 1)

Or for a30:

  a30 * 229 + a30 * 228 + ...
= a30 * (229 + 228 + ...)
= a30 * (230 - 1)

We find: q = a31 * (231 - 1) + a30 * (230 - 1) + ...

And thus

u - q = a31 * 231 - a31 * (231 - 1) + ...
      = a31 + a30 + ... + a0

Counting bits in 3-bit blocks

This algorithm starts by doing the same thing, but in blocks of 3 bits:

u >> 0                = AaBbbCccDddEeeFffGggHhhIiiJjjKkk (each letter is a bit)
u >> 1 & 033333333333 =  A Bb Cc Dd Ee Ff Gg Hh Ii Jj Kk (blank = zero)
u >> 2 & 011111111111 =     B  C  D  E  F  G  H  I  J  K

Taking the difference of these, by the above algorithm, each octet in uCount contains the number of bits set in the corresponding octet in u.

uCount      =   αβγδεζηθικλ (each greek letter is an octet)
uCount >> 3 =    αβγδεζηθικ

So uCount + (uCount >> 3) is (λ+κ) * 20 + (κ+ι) * 23 + (ι+θ) * 26 + ...

By ANDing with 0o30707070707, we mask away every other octet, so that we only count each pair once:

r = (λ+κ) *  20 + (ι+θ) *  26 + (η+ζ) *  212 + ...
  = (λ+κ) * 640 + (ι+θ) * 641 + (η+ζ) * 642  + ...

This is a base-64 number, and we want to sum up the base-64 digits to get α+β+γ+δ+ε+ζ+η+θ+ι+κ+λ, our final result. To do this, we calculate its base-64 digital root: knowing that the result can never be bigger than 32, we just modulo the number by 63.

like image 146
Lynn Avatar answered Dec 04 '22 22:12

Lynn