Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Given an array of booleans, what is the most efficient way to select the index of a random TRUE value?

You're given an array of size n, containing arbitrary boolean values.

What is the fastest way to return the index of a random TRUE value.

The algorithm should randomly return any one of the indices containing a TRUE.

like image 236
PaulV Avatar asked Mar 23 '12 13:03

PaulV


2 Answers

Something like this:

int count = 0;
int index = -1;
for (int i = 0; i != n; ++i)
{
    if (values[i])
    {
        ++count;
        if (unit_random <= 1.0f / count)
        {
            index = i;
        }
    }
}

So for 4 values for example you get the following probabilities for their indices:

1: (1 / 1) * (1 / 2) * (2 / 3) * (3 / 4) = 1 / 4 
2: (1 / 2) * (2 / 3) * (3 / 4) = 1 / 4
3: (1 / 3) * (3 / 4) = 1 / 4
4: 1 / 4 = 1 / 4

EDIT: As Steve Jessop pointed out the floating point comparision will eventually lead to a very non uniform selection. Assuming unit_random is defined as rand() / RAND_MAX the comparision can be changed to:

typedef unsigned long long u64;
u64 product = u64(count) * rand();
if (product <= u64(RAND_MAX))

This won't give perfect distribution due to the discrete nature of rand but it will be better.

like image 63
Andreas Brinck Avatar answered Sep 30 '22 12:09

Andreas Brinck


The quickest solution - assuming you don't select repeatedly on the same array - is to pick a random index, return it if it's true, and repeat if not. In the best case, where all entries are true, this is O(1); in the worst case, where only one entry is true, this is O(n) (each try has a 1/n chance of hitting the only true value, which means an expected number of tries of n). This is no worse than any of the other posted solutions.

If you expect the array to usually be almost all false, though, you may want to pick another solution, as the variance in runtime of this random method will be high.

like image 36
Nick Johnson Avatar answered Sep 30 '22 14:09

Nick Johnson