Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Expand random range of integers using probability distribution

I tried to solve the classic problem of generating a random integer between 1 and 7, given a function that generates a random integer between 1 and 5. My approach was to add the result of 2 calls to rand5(), effectively turning this into a "sum of dice rolls" problem. The probability of the sum dice rolls is fairly easy to calculate, so I used it here. An explanation of this is after the code

My question is: How can I calculate what the values of counter should be? The current values are incorrect, as verified by experiment. Do integer values exist that satisfy the probability? And is there a better way to solve this problem with this approach?

def rand5():
    return random.randint(1,5)

def rand7():
    counter = [1,2,3,4,5,4,3]
    while 0 not in counter:
        sum = rand5() + rand5() - 2
        if sum <= 6:
            counter[sum] -= 1
    return counter.index(0) + 1

For reference, the following code appears to create a random distribution.

test_counter = [0,0,0,0,0,0,0,0,0,0,0,0]
for i in range(500000):
    test_counter[rand5() + rand5() - 2] += 1

test_counter[0] *= 60
test_counter[1] *= 30
test_counter[2] *= 20
test_counter[3] *= 15
test_counter[4] *= 12
test_counter[5] *= 15
test_counter[6] *= 20
test_counter[7] *= 0
test_counter[8] *= 0

print(test_counter)

Explanation of probability: The probability of the dice rolls can be calculated by listing the possible combinations of dice. For this problem, the numbers generated by each die (the rand5 function) would be:

{(1,1), (1,2), (1,3), (1,4), (1,5), (2,1), (2,2), ..., (5,5)}

The probability of each sum is the number of ways the sum appears in the list, divided by the total number of items in the list. The list has 5^2 = 25 total elements. For example, a sum of 4 can be achieved by the following combinations {(1,3), (2,2), (3,1)}, so the probability of a sum of 4 is 3/25.

The probability of each result is then:

  1. 1/25
  2. 2/25
  3. 3/25
  4. 4/25
  5. 5/25
  6. 4/25
  7. 3/25
  8. 2/25
  9. 1/25

I tried to use this distribution to generate a uniform distribution by having the more common ones have to be generated multiple times, and this is stored in counter.

like image 811
The Bic Pen Avatar asked Dec 17 '18 21:12

The Bic Pen


1 Answers

Not sure going via dice distribution is good idea. In general, if you have random bits source, but short sequences, better combine and chop bits to make up longer random bits sequence. Along the lines

import random

def rand5():
    return random.randint(1, 5)

def twobits():
    q = rand5() - 1 # [0...5) range
    while q == 4: # dropping high bit
        q = rand5() - 1
    return q # [0...3) range, two random bits

def onebit():
    return twobits() & 1

def rand7():
    q = onebit() << 2 | twobits() # here we have [0...8) range
    while q == 0:                 # and dropping 0
        q = onebit() << 2 | twobits()
    return q

counter = 8*[0]

for i in range(500000):
    counter[rand7()] += 1

print(counter)

produced uniform in [1...8) sampling

[0, 71592, 71352, 71071, 71543, 71600, 71388, 71454]

Take two bits from one sample, one bit from another sample, combine them, some rejection and voila!

like image 81
Severin Pappadeux Avatar answered Oct 07 '22 07:10

Severin Pappadeux