
Counting the number of one bits in a byte with O(1) time complexity (C++ code)

I've searched for an algorithm that counts the number of ones in a byte with time complexity O(1), and this is what I found on Google:

// C++ implementation of the approach  
#include <bits/stdc++.h> 
using namespace std; 
  
int BitsSetTable256[256]; 
  
// Function to initialise the lookup table  
void initialize()  
{  
  
    // To initially generate the  
    // table algorithmically  
    BitsSetTable256[0] = 0;  
    for (int i = 0; i < 256; i++) 
    {  
        BitsSetTable256[i] = (i & 1) +  
        BitsSetTable256[i / 2];  
    }  
}  
  
// Function to return the count  
// of set bits in n  
int countSetBits(int n)  
{  
    return (BitsSetTable256[n & 0xff] +  
            BitsSetTable256[(n >> 8) & 0xff] +  
            BitsSetTable256[(n >> 16) & 0xff] +  
            BitsSetTable256[n >> 24]);  
}  
  
// Driver code  
int main()  
{  
    // Initialise the lookup table  
    initialize();  
    int n = 9;  
    cout << countSetBits(n); 
}  
  

I understand why I need an array of size 256 (in other words, the size of the lookup table) for indexing from 0 to 255, which are all the decimal values a byte can represent!

But in the function initialize I didn't understand the expression inside the for loop: BitsSetTable256[i] = (i & 1) + BitsSetTable256[i / 2];
Why am I doing that?! I didn't understand the purpose of this line of code inside the for loop.

In addition, the function countSetBits returns:

    return (BitsSetTable256[n & 0xff] +
            BitsSetTable256[(n >> 8) & 0xff] +
            BitsSetTable256[(n >> 16) & 0xff] +
            BitsSetTable256[n >> 24]);

I didn't understand at all what the bitwise AND with 0xff is doing, or why I'm doing a right shift. Could anyone please explain the concept to me?! I also didn't understand why, in countSetBits, the last term BitsSetTable256[n >> 24] isn't ANDed with 0xff.

I understand why I need the lookup table of size 2^8, but I didn't understand the other lines of code I mentioned above. Could anyone please explain them to me in simple words? And what is the purpose of counting the number of ones in a byte?

Thanks a lot guys!

Asked Sep 08 '20 by LiamLony


2 Answers

Concerning the first part of the question:

// Function to initialise the lookup table  
void initialize()  
{  
  
    // To initially generate the  
    // table algorithmically  
    BitsSetTable256[0] = 0;  
    for (int i = 0; i < 256; i++) 
    {  
        BitsSetTable256[i] = (i & 1) +  
        BitsSetTable256[i / 2];  
    }  
}  

This is a neat kind of recursion. (Please note, I don't mean a "recursive function" but recursion in a more mathematical sense.)

The seed is BitsSetTable256[0] = 0;

Then every element is initialized using the (already computed) result for i / 2, to which 1 or 0 is added. Thereby,

  • 1 is added if the last bit of index i is 1
  • 0 is added if the last bit of index i is 0.

To get the value of the last bit of i, i & 1 is the usual C/C++ bit-mask trick.

Why is the result of BitsSetTable256[i / 2] a value to build upon? Because BitsSetTable256[i / 2] is the number of set bits of i with the last bit excluded.

Please note that i / 2 and i >> 1 (the value (or bits) shifted to the right by 1, whereby the least significant/last bit is dropped) are equivalent expressions (for non-negative numbers in the respective range; edge cases excluded).
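
To see the recurrence at work, here is a minimal, self-contained sketch (my own illustration, not part of the original answer) that builds the table exactly as above and cross-checks every entry against a naive per-bit count:

// Sketch: build the table with the recurrence and verify it against a
// straightforward loop over the 8 bit positions.
#include <iostream>

int main()
{
    int table[256];
    table[0] = 0;                              // the seed
    for (int i = 1; i < 256; i++)
        table[i] = (i & 1) + table[i / 2];     // last bit of i + set bits of i/2

    bool ok = true;
    for (int i = 0; i < 256; i++)
    {
        int naive = 0;
        for (int b = 0; b < 8; b++)            // count the 8 bits of i one by one
            naive += (i >> b) & 1;
        if (naive != table[i])
            ok = false;
    }
    std::cout << (ok ? "table matches the naive count for 0..255\n"
                     : "mismatch found\n");
}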


Concerning the other part of the question:

    return (BitsSetTable256[n & 0xff] +  
            BitsSetTable256[(n >> 8) & 0xff] +  
            BitsSetTable256[(n >> 16) & 0xff] +  
            BitsSetTable256[n >> 24]);
  • n & 0xff masks out the upper bits, isolating the lowest 8 bits.
  • (n >> 8) & 0xff shifts the value of n 8 bits to the right (whereby the 8 lowest bits are dropped) and then masks out the upper bits, isolating the next 8 bits.
  • (n >> 16) & 0xff shifts the value of n 16 bits to the right (whereby the 16 lowest bits are dropped) and then masks out the upper bits, isolating the next 8 bits.
  • n >> 24 shifts the value of n 24 bits to the right (whereby the 24 lowest bits are dropped), which should effectively make the upper 8 bits the lower 8 bits.

Assuming that int and unsigned usually have 32 bits on today's common platforms, this covers all bits of n.
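
For example (the value 0x12345678 is my own illustration, not from the answer), this is how a 32-bit value is decomposed into the four byte-sized indices that go into the table; unsigned is used here so the shifts are unproblematic:

// Sketch: split the 32-bit value 0x12345678 into its four bytes.
#include <cstdio>

int main()
{
    unsigned n = 0x12345678u;            // 32-bit value assumed
    std::printf("%02X %02X %02X %02X\n", // prints: 78 56 34 12
                n & 0xFFu, (n >> 8) & 0xFFu, (n >> 16) & 0xFFu, (n >> 24) & 0xFFu);
}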

Please note that the right shift of a negative value is implementation-defined.

(I looked up Bitwise shift operators again to be sure.)

So, a right shift of a negative value may fill all the upper bits with 1s.
That can break BitsSetTable256[n >> 24]: for a negative n, n >> 24 may itself be negative,
and hence BitsSetTable256[n >> 24] is an out-of-bounds access.

The better solution would've been:

    return (BitsSetTable256[n & 0xff] +  
            BitsSetTable256[(n >> 8) & 0xff] +  
            BitsSetTable256[(n >> 16) & 0xff] +  
            BitsSetTable256[(n >> 24) & 0xff]);
Answered by Scheff's Cat

BitsSetTable256[0] = 0;
...
BitsSetTable256[i] = (i & 1) +  
     BitsSetTable256[i / 2];  

The above code seeds the lookup table so that each index holds the number of ones of the number used as the index. It works as follows:

  • (i & 1) gives 1 for odd numbers, otherwise 0.
  • An even number has as many binary 1s as that number divided by 2.
  • An odd number has one more binary 1 than that number divided by 2.

Examples:

  • If i == 8 (1000b), then (i & 1) + BitsSetTable256[i / 2] ->
    0 + BitsSetTable256[8 / 2] = 0 + value at index 4 (0100b) = 0 + 1.
  • If i == 7 (0111b), then 1 + BitsSetTable256[7 / 2] = 1 + BitsSetTable256[3] = 1 + value at index 3 (0011b) = 1 + 2.

If you want some formal mathematical proof of why this is so, then I'm not the right person to ask; I'd poke one of the math sites for that.
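
As a quick sanity check (my own sketch, not part of the original answer), the two examples above can be reproduced directly:

// Sketch: seed the table and print the entries for the example indices 8 and 7.
#include <iostream>

int main()
{
    int table[256] = {0};               // table[0] = 0 is the seed
    for (int i = 1; i < 256; i++)
        table[i] = (i & 1) + table[i / 2];

    std::cout << table[8] << '\n';      // 1000b -> prints 1
    std::cout << table[7] << '\n';      // 0111b -> prints 3
}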


As for the shift part, it's just the normal way of splitting up a 32-bit value into 4 x 8 bits, portably and without having to care about endianness (any other method of doing that is highly questionable). If we un-sloppify the code, we get this:

BitsSetTable256[(n >>  0) & 0xFFu] +  
BitsSetTable256[(n >>  8) & 0xFFu] +  
BitsSetTable256[(n >> 16) & 0xFFu] +  
BitsSetTable256[(n >> 24) & 0xFFu] ;  

Each byte is shifted into the least significant byte position, then masked out with a & 0xFFu byte mask.

Using bit shifts on int is, however, a code smell and potentially buggy. To avoid poorly defined behavior, you need to change the function to this:

#include <stdint.h>

uint32_t countSetBits (uint32_t n);
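
Filled out completely, that could look like the following sketch (my own completion of this prototype, reusing the table logic from the question; not the answerer's code):

#include <stdint.h>
#include <iostream>

static int BitsSetTable256[256];

// Same table initialization as in the question.
static void initialize()
{
    BitsSetTable256[0] = 0;
    for (int i = 1; i < 256; i++)
        BitsSetTable256[i] = (i & 1) + BitsSetTable256[i / 2];
}

// All shifts happen on uint32_t, so the behavior is well defined and
// every index stays within 0..255.
static uint32_t countSetBits(uint32_t n)
{
    return BitsSetTable256[(n >>  0) & 0xFFu] +
           BitsSetTable256[(n >>  8) & 0xFFu] +
           BitsSetTable256[(n >> 16) & 0xFFu] +
           BitsSetTable256[(n >> 24) & 0xFFu];
}

int main()
{
    initialize();
    std::cout << countSetBits(9u) << '\n';          // 1001b -> 2
    std::cout << countSetBits(0xFFFFFFFFu) << '\n'; // all 32 bits set -> 32
}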
Answered by Lundin