How can I generate non-repetitive random numbers in numpy?
list = np.random.random_integers(20,size=(10))
Simply generate an array that contains the required range of numbers, then shuffle it by repeatedly swapping a randomly chosen element with the 0th element. This produces a sequence with no duplicate values. Note, however, that the resulting sequence is not particularly random: this swapping scheme does not make every ordering equally likely.
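A standard unbiased alternative to the swap scheme above is the Fisher-Yates shuffle, which only needs to run for as many steps as values are wanted. The following is a rough sketch in plain Python, not part of the original answer; the function name `fisher_yates_sample` and its parameters are made up for illustration.

```python
import random

def fisher_yates_sample(n, k, rng=random):
    """Return k distinct integers from range(n) using a partial Fisher-Yates shuffle."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    pool = list(range(n))
    for i in range(k):
        # Choose a random index in the not-yet-fixed part pool[i:] and swap it into slot i.
        j = rng.randrange(i, n)
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:k]

sample = fisher_yates_sample(20, 10)
print(sample)
```

Each slot is filled from the elements that have not been fixed yet, so no value can appear twice and every ordering of the chosen values is equally likely.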
numpy.random.Generator.choice offers a replace argument to sample without replacement:
from numpy.random import default_rng

rng = default_rng()
numbers = rng.choice(20, size=10, replace=False)
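Note that passing the integer 20 samples from 0 to 19, whereas the question's random_integers(20, ...) call draws from 1 to 20 inclusive. A minimal sketch of the closer match, assuming NumPy 1.17 or newer; the seed value and variable names are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily, only for reproducibility

# Draw 10 distinct values from 1..20 inclusive by passing an explicit array.
values = rng.choice(np.arange(1, 21), size=10, replace=False)
print(values)

# Sanity check: no value appears twice.
assert len(np.unique(values)) == len(values)
```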
If you're on a pre-1.17 NumPy, without the Generator API, you can use random.sample() from the standard library:
import random

print(random.sample(range(20), 10))
You can also use numpy.random.shuffle() and slicing, but this will be less efficient:
import numpy

a = numpy.arange(20)
numpy.random.shuffle(a)  # shuffles in place
print(a[:10])
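With the newer Generator API, the same shuffle-and-slice idea can be written without mutating an existing array. A sketch, assuming NumPy 1.17 or newer; the variable names are illustrative:

```python
import numpy as np

rng = np.random.default_rng()

# permutation(20) returns a shuffled copy of arange(20); keep the first 10 entries.
first_ten = rng.permutation(20)[:10]
print(first_ten)
```

This still shuffles the whole range before slicing, so it has the same inefficiency when only a few values are needed from a large range.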
There's also a replace argument in the legacy numpy.random.choice function, but this argument was implemented inefficiently and then left inefficient due to random number stream stability guarantees, so its use isn't recommended. (It basically does the shuffle-and-slice thing internally.)
Some timings:
import timeit

print("when output size/k is large, np.random.default_rng().choice() is far far quicker, even when including time taken to create np.random.default_rng()")
print(1, timeit.timeit("rng.choice(a=10**5, size=10**4, replace=False, shuffle=False)", setup="import numpy as np; rng=np.random.default_rng()", number=10**3))  # 0.16003450006246567
print(2, timeit.timeit("np.random.default_rng().choice(a=10**5, size=10**4, replace=False, shuffle=False)", setup="import numpy as np", number=10**3))  # 0.19915290002245456
print(3, timeit.timeit("random.sample( population=range(10**5), k=10**4)", setup="import random", number=10**3))  # 5.115292700007558

print("when output size/k is very small, random.sample() is quicker")
print(4, timeit.timeit("rng.choice(a=10**5, size=10**1, replace=False, shuffle=False)", setup="import numpy as np; rng=np.random.default_rng()", number=10**3))  # 0.01609779999125749
print(5, timeit.timeit("random.sample( population=range(10**5), k=10**1)", setup="import random", number=10**3))  # 0.008387799956835806
So numpy.random.Generator.choice is what you usually want to go for, except for very small output size/k.