How can I use np.random.choice here?
I have a list of probabilities p
that is calculated by some operations, for example:
p=[ 1.42836755e-01, 1.42836735e-01 , 1.42836735e-01, 1.42836735e-01
, 4.76122449e-05, 1.42836735e-01 , 4.76122449e-05 , 1.42836735e-01,
1.42836735e-01, 4.76122449e-05]
Usually the sum of p is not exactly equal to 1:
>>> sum(p)
1.0000000017347
I want to make a random choice with probabilities p:
>>> np.random.choice([1,2,3,4,5,6,7,8,9, 10], 4, p=p, replace=False)
array([4, 3, 2, 9])
This works here, but in my program it raises an error:
Traceback (most recent call last):
indexs=np.random.choice(range(len(population)), population_number, p=p, replace=False)
File "mtrand.pyx", line 1141, in mtrand.RandomState.choice (numpy/random/mtrand/mtrand.c:17808)
ValueError: probabilities do not sum to 1
If I print p:
[ 4.17187500e-05 2.49937500e-01 4.16562500e-05 4.16562500e-05
2.49937500e-01 4.16562500e-05 4.16562500e-05 4.16562500e-05
2.49937500e-01 2.49937500e-01]
But it works in the Python shell with this p:
>>> p=[ 4.17187500e-05 , 2.49937500e-01 ,4.16562500e-05 , 4.16562500e-05,
2.49937500e-01 , 4.16562500e-05 , 4.16562500e-05 , 4.16562500e-05,
2.49937500e-01 ,2.49937500e-01]
>>> np.random.choice([1,2,3,4,5,6,7,8,9, 10], 4, p=p, replace=False)
array([ 9, 10, 2, 5])
UPDATE: I have tested it with precision=15:
np.set_printoptions(precision=15)
print(p)
[ 2.499375625000002e-01 2.499375000000000e-01 2.499375000000000e-01
4.165625000000000e-05 4.165625000000000e-05 4.165625000000000e-05
4.165625000000000e-05 4.165625000000000e-05 2.499375000000000e-01
4.165625000000000e-05]
Testing:
>>> p=np.array([ 2.499375625000002e-01 ,2.499375000000000e-01 ,2.499375000000000e-01,
4.165625000000000e-05 ,4.165625000000000e-05, 4.165625000000000e-05,
4.165625000000000e-05 , 4.165625000000000e-05 , 2.499375000000000e-01,
4.165625000000000e-05])
>>> np.sum(p)
1.0000000000000002
How can I fix this so that I can use np.random.choice?
Convert p to float64 and renormalize it:
p = np.asarray(p).astype('float64')
p = p / np.sum(p)
np.random.choice([1,2,3,4,5,6,7,8,9, 10], 4, p=p, replace=False)
This was inspired by another post: How can I avoid value errors when using numpy.random.multinomial?
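To see why the conversion can matter, here is a minimal sketch. It assumes the p in the program was computed in a lower precision such as float32, which the question does not confirm. The weights array is hypothetical and only stands in for whatever produces p.

```python
import numpy as np

# Hypothetical float32 weights standing in for a p computed elsewhere.
weights = np.array([1, 6000, 1, 1, 6000, 1, 1, 1, 6000, 6000],
                   dtype=np.float32)
p32 = weights / weights.sum()            # result is still float32

# View the same values in double precision and measure how far the sum is from 1.
p64 = np.asarray(p32, dtype=np.float64)
print(p32.dtype, abs(p64.sum() - 1.0))

# Renormalize in float64 so that the sum is 1 up to double-precision rounding.
p64 = p64 / p64.sum()
idx = np.random.choice(len(p64), 4, p=p64, replace=False)
print(idx)
```

Printing the dtype and the deviation from 1 is a quick way to check whether precision is the culprit before renormalizing.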
This is a known issue with numpy. The random choice function checks that the probabilities sum to 1 within a given tolerance (see the numpy source).
The solution is to normalize the probabilities by dividing them by their sum, provided the sum is already close enough to 1.
Example:
>>> p=[ 1.42836755e-01, 1.42836735e-01 , 1.42836735e-01, 1.42836735e-01
, 4.76122449e-05, 1.42836735e-01 , 4.76122449e-05 , 1.42836735e-01,
1.42836735e-01, 4.79122449e-05]
>>> sum(p)
1.0000003017347 # over tolerance limit
>>> np.random.choice([1,2,3,4,5,6,7,8,9, 10], 4, p=p, replace=False)
Traceback (most recent call last):
File "<pyshell#23>", line 1, in <module>
np.random.choice([1,2,3,4,5,6,7,8,9, 10], 4, p=p, replace=False)
File "mtrand.pyx", line 1417, in mtrand.RandomState.choice (numpy\random\mtrand\mtrand.c:15985)
ValueError: probabilities do not sum to 1
With normalization:
>>> p = np.array(p)
>>> p /= p.sum() # normalize
>>> np.random.choice([1,2,3,4,5,6,7,8,9, 10], 4, p=p, replace=False)
array([8, 4, 1, 6])
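If this comes up in several places, the normalization can be wrapped in a small helper. The sketch below is only one way to do it: normalize_probabilities and its tol default of 1e-6 are hypothetical names and values chosen for illustration, not part of numpy. It refuses to rescale when the sum is far from 1, since that usually points to a bug rather than rounding noise.

```python
import numpy as np

def normalize_probabilities(p, tol=1e-6):
    """Return p as a float64 array rescaled to sum to 1.

    Hypothetical helper: raises ValueError if the sum differs from 1
    by more than `tol` (an arbitrary threshold chosen for this sketch).
    """
    p = np.asarray(p, dtype=np.float64)
    total = p.sum()
    if abs(total - 1.0) > tol:
        raise ValueError(f"probabilities sum to {total!r}, too far from 1")
    return p / total

# Same p as in the example above (sum is about 1.0000003).
p = [1.42836755e-01, 1.42836735e-01, 1.42836735e-01, 1.42836735e-01,
     4.76122449e-05, 1.42836735e-01, 4.76122449e-05, 1.42836735e-01,
     1.42836735e-01, 4.79122449e-05]

sample = np.random.choice(np.arange(1, 11), 4,
                          p=normalize_probabilities(p), replace=False)
print(sample)
```

A p such as [0.5, 0.6] would raise a ValueError here instead of being silently rescaled.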