You're given an array of size n, containing arbitrary boolean values.
What is the fastest way to return the index of a random TRUE value.
The algorithm should randomly return any one of the indices containing a TRUE.
Something like this:
int count = 0;
int index = -1;
for (int i = 0; i != n; ++i)
{
if (values[i])
{
++count;
if (unit_random <= 1.0f / count)
{
index = i;
}
}
}
So for 4 values for example you get the following probabilities for their indices:
1: (1 / 1) * (1 / 2) * (2 / 3) * (3 / 4) = 1 / 4
2: (1 / 2) * (2 / 3) * (3 / 4) = 1 / 4
3: (1 / 3) * (3 / 4) = 1 / 4
4: 1 / 4 = 1 / 4
EDIT: As Steve Jessop pointed out the floating point comparision will eventually lead to a very non uniform selection. Assuming unit_random
is defined as rand() / RAND_MAX
the comparision can be changed to:
typedef unsigned long long u64;
u64 product = u64(count) * rand();
if (product <= u64(RAND_MAX))
This won't give perfect distribution due to the discrete nature of rand
but it will be better.
The quickest solution - assuming you don't select repeatedly on the same array - is to pick a random index, return it if it's true, and repeat if not. In the best case, where all entries are true, this is O(1); in the worst case, where only one entry is true, this is O(n) (each try has a 1/n chance of hitting the only true value, which means an expected number of tries of n). This is no worse than any of the other posted solutions.
If you expect the array to usually be almost all false, though, you may want to pick another solution, as the variance in runtime of this random method will be high.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With