Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Count bitmasks, enumerate 0s

Tags:

c++

python

c

binary

I had the following question in an interview and I believe I gave a working implementation but I was wondering if there was a better implementation that was quicker, or just a trick I missed.

Given 3 unsigned 30-bit integers, return the number of 30-bit integers that when compared with any of the original numbers have the same position bits set to 1. That is we enumerate all the 0s

Let me give an example, but lets use 4bit for clarity.

Given:

A = 1001
B = 0011
C = 0110

It should return 8 since there are 8 4bit ints that are in the set. The set being:

0011
0110
0111
1001
1011
1101
1110
1111

Now how I worked it out was to take each number and enumerate the set of possibilites and then count all the distinct values. How I enumerated the set was to start with the number, add one to it and then OR it with itself until I reached the mask. With the number itself being in the set and the mask (being all set to 1) also in the set. So for example to enumerate the set of 1001:

1001 = the start
1011 = (1001 + 1) | 1001
1101 = (1011 + 1) | 1001
1111 = (1101 + 1) | 1001 (this is the last value as we have reached our mask)

So do that for each number and then count the uniques.

This is it in python code (but language doesn't really matter as long as you can do bitwise operations, hence why this question is tagged for c/c++ as well):

MASK = 0x3FFFFFFF

def count_anded_bitmasks( A, B, C ):
    andSets = set(
      enumerate_all_subsets(A) +
      enumerate_all_subsets(B) +
      enumerate_all_subsets(C)
    )
    return len(andSets)


def enumerate_all_subsets( d ):
    andSet = []
    n = d
    while n != MASK:
        andSet.append(n)
        n = (n + 1) | d
    andSet.append(n)
    return andSet

Now this works and gives the correct answer but I'm wondering if I have missed a trick. Since the question was to only ask the count and not enumerate all the values perhaps there is a much quicker way. Either by combining the numbers first, or getting a count without enumeration. I have a feeling there is. Since numbers that contain lots of zeros the enumeration rises exponentially and it can take quite a while.

If you have A B and C, the count of the set of numbers which has bits set to 1 where A or B or C has corresponding bits set to 1.

Some people don't understand the question (didn't help that I didn't ask it correctly first of). Let's use the given A B and C values above:

A:

1001
1011
1101
1111

B:

0011
0111
1011
1111

C:

0110
0111
1110
1111

Now combine those sets and count the distinct entries. That is the answer. Is there a way to do this without enumerating the values?

edit: Sorry for the mistake the question. Fixed now.

like image 369
dalore Avatar asked May 01 '12 16:05

dalore


1 Answers

EDIT: updated requirement: Given 3 unsigned 30-bit integers, return the number of 30-bit integers that when compared with any of the original numbers have the same position bits set to 1. That is we enumerate all the 0s

OK that's a bit tougher. It's easy to calculate for one number, since in that case the number of possible integers depends only on the number of zero bits, like this:

// Count bits not set
const size_t NUMBITS=30;
size_t c;
size_t v = num;
for (c=NUMBITS; v; v >>= 1) 
  c -= v & 1;

return c; 

Naively you might try extending this to three integers by doing it for each and summing the results, however that would be wrong because the possibilities need to be unique, e.g. given

A = 1001
B = 0011
C = 0110

You would count e.g. 1111 three times rather than once. You should subtract the number of combinations that are shared between any two numbers, but not subtract any combination twice. This is simply a C++ translation of Winston Ewert's answer!

size_t zeroperms(size_t v)
{
    // Count number of 0 bits
    size_t c = 1;
    for (c=NUMBITS; v; v >>= 1)
        c -= v & 1;
    // Return number of permutations possible with those bits
    return 1 << c;
}

size_t enumerate(size_t a, size_t b, size_t c)
{
    size_t total = zeroperms(a) + zeroperms(b) + zeroperms(c);
    total -= zeroperms(a | b); // counted twice, remove one
    total -= zeroperms(b | c); // counted twice, remove one
    total -= zeroperms(a | c); // counted twice, remove one
    total += zeroperms(a | b | c); // counted three times, removed three times, add one
    return total;
}
like image 136
smocking Avatar answered Sep 25 '22 01:09

smocking