I'm trying to create a function that returns a numpy.array
with n
pseudo-random uniformly distributed numbers between 0 and 1. The details of the method used can be found here: https://en.wikipedia.org/wiki/Linear_congruential_generator
So far, it works great. The only problem is that each new value is calculated by using the previous value, so the only solution I've found so far uses a loop, and for the sake of eficiency I'm trying to get rid of that loop, possibly by vectorizing the operation - however, I don't know how to do so.
Do you have any suggestions on how to optimize this function?
import numpy as np
import time
def unif(n):
m = 2**32
a = 1664525
c = 1013904223
result = np.empty(n)
result[0] = int((time.time() * 1e7) % m)
for i in range(1,n):
result[i] = (a*result[i-1]+c) % m
return result / m
Although not vectorized, I believe the following solution is about 2x faster (60x faster with the numba solution). It saves each result
as a local variable instead of accessing the numpy array by location.
def unif_improved(n):
m = 2**32
a = 1664525
c = 1013904223
results = np.empty(n)
results[0] = result = int((time.time() * 1e7) % m)
for i in range(1, n):
result = results[i] = (a * result + c) % m
return results / m
You may also consider using Numba for further efficiencies in speed. https://numba.pydata.org/
Just the addition of the decorator @jit
blows to doors off of the other solutions.
from numba import jit
@jit
def unif_jit(n):
# Same code as `unif_improved`
Timings
>>> %timeit -n 10 unif_original(500000)
715 ms ± 21.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
>>> %timeit -n 10 unif_improved(500000)
323 ms ± 8 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
>>> %timeit -n 10 unif_jit(500000)
12 ms ± 2.68 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
This isn't possible to do completely since the answers depend on each other sequentially. The magic of modular arithmetic does mean that you can get a small improvement gain with the following change (modified from @Alexander's suggestion to use a local variable instead of an array lookup).
def unif_2(n):
m = 2**32
a = 1664525
c = 1013904223
results = np.empty(n)
results[0] = result = int((time.time() * 1e7) % m)
for i in range(1, n):
result = results[i] = (a * result + c)
return results % m / m
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With