Yesterday I had this interview question, which I couldn't fully answer:
Given a function f() that returns 0 or 1,
each with probability 1/2, construct a function f(n) that returns 0, 1, 2, ..., n-1,
each with probability 1/n.
I could come up with a solution for the case where n is a power of 2: use f()
to generate the k = log_2 n bits of a binary number.
But this obviously doesn't work for, say, n = 5, since it can also generate 5, 6 or 7,
which we do not want.
Does anyone know a solution?
Just roll the die until a number from 1 to 5 comes up, i.e. reroll on a 6. The expected number of throws is low.
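As a sketch of this reroll idea (using Python's `random.randint` as a stand-in for a physical die):

```python
import random

def d5():
    """Roll a fair d6, rerolling any 6, to get a uniform value in 1..5."""
    while True:
        roll = random.randint(1, 6)  # fair six-sided die
        if roll <= 5:                # keep 1..5, reroll on 6
            return roll
```

Each accepted roll is uniform over 1..5, and the expected number of throws is 6/5 = 1.2.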
You can build an RNG for the smallest power of two greater than or equal to n,
as you described. Then, whenever this algorithm generates a number larger than n-1,
throw that number away and try again. This is called the rejection method.
Addendum
The algorithm is
Let m = 2^k >= n where k is as small as possible.
do
Let r = random number in 0 .. m-1 generated by k coin flips
while r >= n
return r
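A direct translation of this pseudocode into Python, using `random.getrandbits(1)` as a stand-in for the fair coin f() (an assumption; substitute your own bit source):

```python
import random

def coin():
    """Stand-in for f(): one fair bit."""
    return random.getrandbits(1)

def uniform(n):
    """Return a uniform value in 0..n-1 by rejection sampling on coin flips."""
    k = max(1, (n - 1).bit_length())   # smallest k with 2**k >= n
    while True:
        r = 0
        for _ in range(k):             # k coin flips build a k-bit number in 0..2**k - 1
            r = (r << 1) | coin()
        if r < n:                      # accept only in-range values, reject the rest
            return r
```

Each accepted value is uniform over 0..n-1, because all 2^k raw values are equally likely and out-of-range ones are discarded.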
The probability that this loop stops within i
iterations is at least 1 - (1/2)^i, since each iteration accepts with probability at least 1/2 (because n > m/2).
This goes to 1 very rapidly: the loop is still running after 30 iterations with probability less than one in a billion.
You can decrease the expected number of iterations with a slightly modified algorithm:
Choose p >= 1
Let m = 2^k >= p n where k is as small as possible.
do
Let r = random number in 0 .. m-1 generated by k coin flips
while r >= p n
return floor(r / p)
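The modified algorithm can be sketched the same way (again assuming `random.getrandbits(1)` stands in for f()):

```python
import random

def coin():
    """Stand-in for f(): one fair bit."""
    return random.getrandbits(1)

def uniform_batched(n, p=3):
    """Return a uniform value in 0..n-1, oversampling by a factor p
    to reduce the rejection rate."""
    target = p * n
    k = max(1, (target - 1).bit_length())  # smallest k with 2**k >= p*n
    while True:
        r = 0
        for _ in range(k):
            r = (r << 1) | coin()
        if r < target:
            return r // p                  # each output maps from exactly p raw values
```

The floor division is what keeps the result uniform: the accepted range 0..pn-1 splits into n buckets of exactly p values each.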
For example, if we are trying to generate 0 .. 4 (n = 5) with the simpler algorithm, we would reject 5, 6 and 7, i.e. 3/8 of the results. With p = 3
(for example), pn = 15,
so we'd have m = 16 and would reject only 15, or 1/16 of the results. The price is needing four coin flips rather than three, plus a division op. You can continue to increase p,
adding coin flips, to drive the rejection rate as low as you wish.
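The rejection rates quoted above are simple arithmetic, which a helper can verify:

```python
def rejection_rate(n, p=1):
    """Fraction of k-bit draws rejected when oversampling n outcomes by factor p."""
    m = 1 << max(1, (p * n - 1).bit_length())  # smallest power of two >= p*n
    return (m - p * n) / m

# n = 5: p = 1 rejects 3 of 8 draws; p = 3 rejects 1 of 16 draws
```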
Another interesting solution can be derived through a Markov chain Monte Carlo technique, the Metropolis-Hastings algorithm. This would be significantly more efficient if a large number of samples were required, but it only approaches the uniform distribution in the limit.
initialize: x[0] arbitrarily
for i=1,2,...,N
    if (f() == 1) x[i] = (x[i-1] + 1) % n
    else x[i] = (x[i-1] - 1 + n) % n
For large N, the vector x will contain approximately uniformly distributed numbers between 0 and n-1. Additionally, by adding an accept/reject step we can simulate from an arbitrary distribution, but that would require simulating uniform random numbers on [0, 1] as a sub-procedure.
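A runnable sketch of this random walk on 0..n-1 (assuming `random.getrandbits(1)` stands in for f(), and starting the chain at 0):

```python
import random

def f():
    """Stand-in for the fair coin: one fair bit."""
    return random.getrandbits(1)

def mcmc_uniform(n, N, x0=0):
    """Random walk on 0..n-1: step +1 on heads, -1 on tails, wrapping mod n.
    The marginal distribution only approaches uniform as N grows."""
    x = [x0]
    for i in range(1, N + 1):
        if f() == 1:
            x.append((x[i - 1] + 1) % n)
        else:
            x.append((x[i - 1] - 1 + n) % n)
    return x
```

Note that successive samples are correlated, and for even n the walk alternates between even and odd states, so in practice one would thin the chain or use a lazy variant (occasionally staying put) to get convergence.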