Yesterday I had this interview question, which I couldn't fully answer:
Given a function f() that returns 0 or 1,
each with probability 1/2, construct a function f(n) that returns 0, 1, 2, ..., n-1,
each with probability 1/n.
I could come up with a solution for the case where n is a power of 2: use f()
to generate the k = log_2 n bits of a binary number.
But this obviously doesn't work for, say, n = 5, since it can also generate 5, 6 or 7,
which we do not want.
Does anyone know a solution?
Just roll the die until a number from 1 to 5 comes up, i.e. reroll on a 6. The expected number of throws is low.
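As a sketch of this reroll idea (using Python's `random.randint` as a stand-in for a physical die):

```python
import random

def d5():
    """Roll a fair d6, rerolling any 6, to get a uniform value in 1..5."""
    while True:
        roll = random.randint(1, 6)  # fair six-sided die
        if roll <= 5:                # keep 1..5, reroll on 6
            return roll
```

Each accepted roll is uniform over 1..5, and the expected number of throws is 6/5 = 1.2.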
You can build an RNG for the smallest power of two greater than or equal to n,
as you described. Then, whenever this algorithm generates a number larger than n-1,
throw that number away and try again. This is called the rejection method.
Addendum
The algorithm is
Let m = 2^k >= n where k is as small as possible.
do
Let r = random number in 0 .. m-1 generated by k coin flips
while r >= n
return r
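A direct translation of this pseudocode into Python, using `random.getrandbits(1)` as a stand-in for the fair coin f() (an assumption; substitute your own bit source):

```python
import random

def coin():
    """Stand-in for f(): one fair bit."""
    return random.getrandbits(1)

def uniform(n):
    """Return a uniform value in 0..n-1 by rejection sampling on coin flips."""
    k = max(1, (n - 1).bit_length())   # smallest k with 2**k >= n
    while True:
        r = 0
        for _ in range(k):             # k coin flips build a k-bit number in 0..2**k - 1
            r = (r << 1) | coin()
        if r < n:                      # accept only in-range values, reject the rest
            return r
```

Each accepted value is uniform over 0..n-1, because all 2^k raw values are equally likely and out-of-range ones are discarded.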
The probability that this loop stops within i
iterations is at least 1 - (1/2)^i, since each iteration accepts with probability at least 1/2 (because n > m/2).
This goes to 1 very rapidly: the loop is still running after 30 iterations with probability less than one in a billion.
You can decrease the expected number of iterations with a slightly modified algorithm:
Choose p >= 1
Let m = 2^k >= p n where k is as small as possible.
do
Let r = random number in 0 .. m-1 generated by k coin flips
while r >= p n
return floor(r / p)
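The modified algorithm can be sketched the same way (again assuming `random.getrandbits(1)` stands in for f()):

```python
import random

def coin():
    """Stand-in for f(): one fair bit."""
    return random.getrandbits(1)

def uniform_batched(n, p=3):
    """Return a uniform value in 0..n-1, oversampling by a factor p
    to reduce the rejection rate."""
    target = p * n
    k = max(1, (target - 1).bit_length())  # smallest k with 2**k >= p*n
    while True:
        r = 0
        for _ in range(k):
            r = (r << 1) | coin()
        if r < target:
            return r // p                  # each output maps from exactly p raw values
```

The floor division is what keeps the result uniform: the accepted range 0..pn-1 splits into n buckets of exactly p values each.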
For example, if we are trying to generate 0 .. 4 (n = 5) with the simpler algorithm, we would reject 5, 6 and 7, i.e. 3/8 of the results. With p = 3
(for example), pn = 15,
so we'd have m = 16 and would reject only 15, or 1/16 of the results. The price is needing four coin flips rather than three, plus a division op. You can continue to increase p,
adding coin flips, to drive the rejection rate as low as you wish.
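The rejection rates quoted above are simple arithmetic, which a helper can verify:

```python
def rejection_rate(n, p=1):
    """Fraction of k-bit draws rejected when oversampling n outcomes by factor p."""
    m = 1 << max(1, (p * n - 1).bit_length())  # smallest power of two >= p*n
    return (m - p * n) / m

# n = 5: p = 1 rejects 3 of 8 draws; p = 3 rejects 1 of 16 draws
```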
Another interesting solution can be derived through a Markov chain Monte Carlo technique, the Metropolis-Hastings algorithm. This would be significantly more efficient if a large number of samples were required, but it only approaches the uniform distribution in the limit.
initialize: x[0] arbitrarily
for i=1,2,...,N
    if (f() == 1) x[i] = (x[i-1] + 1) % n
    else x[i] = (x[i-1] - 1 + n) % n
For large N, the vector x will contain approximately uniformly distributed numbers between 0 and n-1. Additionally, by adding an accept/reject step we can simulate from an arbitrary distribution, but that would require simulating uniform random numbers on [0, 1] as a sub-procedure.
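A runnable sketch of this random walk on 0..n-1 (assuming `random.getrandbits(1)` stands in for f(), and starting the chain at 0):

```python
import random

def f():
    """Stand-in for the fair coin: one fair bit."""
    return random.getrandbits(1)

def mcmc_uniform(n, N, x0=0):
    """Random walk on 0..n-1: step +1 on heads, -1 on tails, wrapping mod n.
    The marginal distribution only approaches uniform as N grows."""
    x = [x0]
    for i in range(1, N + 1):
        if f() == 1:
            x.append((x[i - 1] + 1) % n)
        else:
            x.append((x[i - 1] - 1 + n) % n)
    return x
```

Note that successive samples are correlated, and for even n the walk alternates between even and odd states, so in practice one would thin the chain or use a lazy variant (occasionally staying put) to get convergence.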