
Python random not working like

Attempted problem: The probability that one of two dice will have a higher value than a third die.

Problem: For some reason, when I use Python's random module (specifically the sample function), I get a different (and incorrect) result than when I use numpy. I've included the results at the bottom, and repeated runs give similar results. Any idea why random.sample and numpy.random.random_integers give different results even though they appear to do the same thing?

import numpy as np                                                              
import random                                                                   


random_list = []                                                                
numpy_list = []                                                                 
n= 500                                                                          
np_wins = 0                                                                     
rand_wins = 0                                                                   
for i in range(n):                                                              
    rolls = random.sample(range(1,7), 3)                                        
    rand_wins += any(rolls[0] < roll for roll in rolls)                         

    rolls = np.random.random_integers(1, 6, 3)                                  
    np_wins += any(rolls[0] < roll for roll in rolls)                           


print "numpy : {}".format(np_wins/(n * 1.0))                                    
print "random : {}".format(rand_wins/(n * 1.0))           

Result:


numpy : 0.586
random : 0.688

Preom asked Jul 16 '14 10:07


3 Answers

The reason for the observed difference is that random.sample samples without replacement, while numpy.random.random_integers samples with replacement.
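
A minimal sketch of that difference, using only the standard library. Here `random.choices` (Python 3.6+) stands in as a with-replacement counterpart; it is not part of the question's code, and the seed and the helper name `has_repeat` are arbitrary choices for illustration.

```python
import random

rng = random.Random(0)  # arbitrary seed, only for repeatability
faces = range(1, 7)

# Without replacement: the three values in each draw are always distinct.
without = [rng.sample(faces, 3) for _ in range(10_000)]
# With replacement: repeated values such as [1, 1, 1] are possible.
with_repl = [rng.choices(faces, k=3) for _ in range(10_000)]

def has_repeat(rolls):
    """Return True if any face appears more than once in the draw."""
    return len(set(rolls)) < len(rolls)

print(any(has_repeat(r) for r in without))    # sample never repeats a face
print(any(has_repeat(r) for r in with_repl))  # repeats do occur with replacement
```

Real dice are independent, so doubles and triples must be possible; only the with-replacement draw models that.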

rroowwllaanndd answered Oct 11 '22 21:10


random.sample() never returns the same element twice. It is like drawing numbers without putting them back, so a result like [1, 1, 1] will never occur.

np.random.random_integers(), on the other hand, is what you really want if you are simulating three dice rolls.

You can replace your random.sample() call with something like [random.randint(1, 6) for _ in range(3)] to get the same result as the numpy version.
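
As a rough sketch, the question's loop with that replacement could look like the following in Python 3. The function name `estimate`, the seed, and the trial count are assumptions made for illustration, not part of the original code.

```python
import random

def estimate(n, rng):
    """Estimate P(first die is lower than at least one of the other two)."""
    wins = 0
    for _ in range(n):
        # Three independent rolls, so repeated values are allowed.
        rolls = [rng.randint(1, 6) for _ in range(3)]
        wins += any(rolls[0] < roll for roll in rolls)
    return wins / n

rng = random.Random(2014)  # arbitrary seed, only for repeatability
print("random : {}".format(estimate(100_000, rng)))
```

With this change the standard-library estimate should land close to the numpy figure rather than near 0.69.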

Alfe answered Oct 11 '22 21:10


Two problems here (one minor, one significant):

  1. Your sample size is too small to give a reliable estimate. If I do only 500 rolls, I get a result anywhere between 0.55 and 0.62. Hardly accurate.

  2. random.sample picks 3 items from the given sequence without putting them back. So you're not doing three dice rolls, you're picking three distinct numbers from the range [1, 6].

    In fact, if I do that, the probability is about 67 %, whereas for the problem you stated it's closer to 58 %, as you observed.
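
Both percentages can be checked exactly by enumerating every outcome instead of sampling. This is a small sketch with hypothetical helper names (`win`, `exact`); it assumes that each ordered triple is equally likely in both models.

```python
from fractions import Fraction
from itertools import permutations, product

def win(rolls):
    # Same test as in the question: is the first value below any other?
    return any(rolls[0] < roll for roll in rolls)

def exact(outcomes):
    """Exact probability of a win over equally likely outcomes."""
    outcomes = list(outcomes)
    return Fraction(sum(win(o) for o in outcomes), len(outcomes))

faces = range(1, 7)
dice = exact(product(faces, repeat=3))    # independent rolls: 216 outcomes
distinct = exact(permutations(faces, 3))  # distinct values: 120 outcomes

print(dice, float(dice))          # 125/216, about 0.579
print(distinct, float(distinct))  # 2/3, about 0.667
```

The 2/3 figure also follows directly: among three distinct values, the first one is the largest a third of the time.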

PowerShell test code I used:

Original problem statement:

(1..500 | %{
   $r = 0..2 | %{ Get-Random -min 1 -max 7 }
   !!($r|?{$r[0] -lt $_})
} | measure -ave).Average

Your flawed method:

(1..500 | %{
   $r = 1..6 | Get-Random 3
   !!($r|?{$r[0] -lt $_})
} | measure -ave).Average

Those yield the same difference in results that you observed.
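
To put a number on the sample-size point, here is a short sketch of the standard error of an estimated proportion. It assumes the exact value 125/216 for three independent dice (1 - 91/216, consistent with the roughly 58 % quoted above); the function name `standard_error` and the second trial count are illustrative.

```python
import math

def standard_error(p, n):
    """Standard error of a proportion estimated from n independent trials."""
    return math.sqrt(p * (1 - p) / n)

p = 125 / 216  # exact probability for three independent dice
for n in (500, 50_000):
    se = standard_error(p, n)
    print("n = {:>6}: {:.3f} +/- {:.3f} (two standard errors)".format(n, p, 2 * se))
```

At 500 trials the standard error is about 0.022, so a single run can easily be off by several percentage points; a hundred times more trials shrinks that by a factor of ten.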

Joey answered Oct 11 '22 22:10