Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Unbiased random number generator using a biased one

You have a biased random number generator that produces a 1 with a probability p and 0 with a probability (1-p). You do not know the value of p. Using this make an unbiased random number generator which produces 1 with a probability 0.5 and 0 with a probability 0.5.

Note: this problem is an exercise problem from Introduction to Algorithms by Cormen, Leiserson, Rivest, Stein.(clrs)

like image 527
Rohit Banga Avatar asked Dec 31 '09 19:12

Rohit Banga


People also ask

Is there bias in random number generator?

Curiously, some popular random-number generators fail even in simulating a coin toss. Over time, they should produce roughly the same number of zeros and ones (heads and tails). Instead, random-number generators that are often used to produce such sequences tend to cluster zeros together, introducing a bias.

How does Django generate random numbers?

randint() method is used to generate a random number between the start and stop.


2 Answers

The events (p)(1-p) and (1-p)(p) are equiprobable. Taking them as 0 and 1 respectively and discarding the other two pairs of results you get an unbiased random generator.

In code this is done as easy as:

int UnbiasedRandom()
{
    int x, y;

    do
    {
        x = BiasedRandom();
        y = BiasedRandom();
    } while (x == y);

    return x;
}
like image 187
pau.estalella Avatar answered Sep 28 '22 12:09

pau.estalella


The procedure to produce an unbiased coin from a biased one was first attributed to Von Neumann (a guy who has done enormous work in math and many related fields). The procedure is super simple:

  • Toss the coin twice.
  • If the results match, start over, forgetting both results.
  • If the results differ, use the first result, forgetting the second.

The reason this algorithm works is because the probability of getting HT is p(1-p), which is the same as getting TH (1-p)p. Thus two events are equally likely.

I am also reading this book and it asks the expected running time. The probability that two tosses are not equal is z = 2*p*(1-p), therefore the expected running time is 1/z.


The previous example looks encouraging (after all, if you have a biased coin with a bias of p=0.99, you will need to throw your coin approximately 50 times, which is not that many). So you might think that this is an optimal algorithm. Sadly it is not.

Here is how it compares with the Shannon's theoretical bound (image is taken from this answer). It shows that the algorithm is good, but far from optimal.

enter image description here

You can come up with an improvement if you will consider that HHTT will be discarded by this algorithm, but in fact it has the same probability as TTHH. So you can also stop here and return H. The same is with HHHHTTTT and so on. Using these cases improves the expected running time, but are not making it theoretically optimal.


And in the end - python code:

import random

def biased(p):
    # create a biased coin
    return 1 if random.random() < p else 0

def unbiased_from_biased(p):
    n1, n2 = biased(p), biased(p)
    while n1 == n2:
        n1, n2 = biased(p), biased(p)

    return n1

p = random.random()
print p

tosses = [unbiased_from_biased(p) for i in xrange(1000)]
n_1 = sum(tosses)
n_2 = len(tosses) - n_1
print n_1, n_2

It is pretty self-explanatory, and here is an example result:

0.0973181652114
505 495

As you see, nonetheless we had a bias of 0.097, we got approximately the same number of 1 and 0

like image 21
Salvador Dali Avatar answered Sep 28 '22 13:09

Salvador Dali