While writing a script I discovered the numpy.random.choice function. I implemented it because it was was much cleaner than the equivalent if statement. However, after running the script I realized it is significantly slower than the if statement.
The following is a MWE. The first method takes 0.0 s, while the second takes 7.2 s. If you scale up the i loop, you will see how fast random.choice slows down.
Can anyone comment on why random.choice is so much slower?
import numpy as np
import numpy.random as rand
import time as tm
#-------------------------------------------------------------------------------
tStart = tm.time()
for i in xrange(100):
for j in xrange(1000):
tmp = rand.rand()
if tmp < 0.25:
var = 1
elif tmp < 0.5:
var = -1
print('Time: %.1f s' %(tm.time() - tStart))
#-------------------------------------------------------------------------------
tStart = tm.time()
for i in xrange(100):
for j in xrange(1000):
var = rand.choice([-1, 0, 1], p = [0.25, 0.5, 0.25])
print('Time: %.1f s' %(tm.time() - tStart))
So, for a single random number, NumPy is significantly slower.
Generating a random floatGenerating a single random float is 10x faster using using Python's built-in random module compared to np. random . with NumPy than with base python. So if you need to generate a single random number—or less than 10 numbers—it is faster to simply loop over random.
Indeed, whenever we call a python function, such as np. random. rand() the output can only be deterministic and cannot be truly random. Hence, numpy has to come up with a trick to generate sequences of numbers that look like random and behave as if they came from a purely random source, and this is what PRNG are.
NumPy is fast because it can do all its calculations without calling back into Python. Since this function involves looping in Python, we lose all the performance benefits of using NumPy. For a 10,000,000-entry NumPy array, this functions takes 2.5 seconds to run on my computer.
You're using it wrong. Vectorize the operation, or numpy will offer no benefit:
var = numpy.random.choice([-1, 0, 1], size=1000, p=[0.25, 0.5, 0.25])
Timing data:
>>> timeit.timeit('''numpy.random.choice([-1, 0, 1],
... size=1000,
... p=[0.25, 0.5, 0.25])''',
... 'import numpy', number=10000)
2.380380242513752
>>> timeit.timeit('''
... var = []
... for i in xrange(1000):
... tmp = rand.rand()
... if tmp < 0.25:
... var.append(1)
... elif tmp < 0.5:
... var.append(-1)
... else:
... var.append(0)''',
... setup='import numpy.random as rand', number=10000)
5.673041396894519
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With