I am writing a cython application where I need to generate a Gaussian random variable on-the-fly in a tight nested loop. I would like to do this without introducing any extra dependencies, e.g., on GSL.
For a minimal version of the way I am currently able to do this with uniform random numbers on-the-fly:
from libc.stdlib cimport rand, RAND_MAX
import numpy as np
cdef double random_uniform():
cdef double r = rand()
return r/RAND_MAX
def my_function(int n):
cdef int i
cdef double[:] result = np.zeros(n, dtype='f8', order='C')
for i in range(n):
result[i] = random_uniform()
return result
The above code is functionally equivalent to numpy.random.rand(n), and can be compiled with the following minimal setup file:
from distutils.core import setup
from Cython.Build import cythonize
import numpy as np
setup(ext_modules=cythonize("example.pyx"), include_dirs=[np.get_include()])
# compile instructions:
# python setup.py build_ext --inplace
To answer this question, what I am looking for is the same kind of minimal solution for the functional equivalent of np.random.randn(n), again ideally with any dependency directly cimported from libc.stdlib for reasons of portability.
There is an example implementation on the Wikipedia entry for the Box-Muller algorithm, but I have had trouble implementing it due to the way that the constant epsilon is defined.
We can use randbelow() function from the secrets module to generate random integers. The below example uses randbelow() to randomly print integers.
The uniform probability distribution is used to generate random numbers from other distributions and also is useful as a “first guess” if no information about a random variable X is known other than that it is between a and b.
I created a function that generates gaussian-distributed random numbers based on the polar version of the Box-Muller transformation, as described by the pseudocode here. (I originally found this on the page archived here.)
This method generates two gaussian-distributed random numbers at a time. That means to get full cython
speed, we need to figure out a way to pass two numbers around without turning them into Python objects. The most straightforward way to do so (that I can think of) is to pass the buffer in for direct manipulation by the generator. That's what my_gaussian_fast
does, and it beats numpy
by a modest margin.
from libc.stdlib cimport rand, RAND_MAX
from libc.math cimport log, sqrt
import numpy as np
import cython
cdef double random_uniform():
cdef double r = rand()
return r / RAND_MAX
cdef double random_gaussian():
cdef double x1, x2, w
w = 2.0
while (w >= 1.0):
x1 = 2.0 * random_uniform() - 1.0
x2 = 2.0 * random_uniform() - 1.0
w = x1 * x1 + x2 * x2
w = ((-2.0 * log(w)) / w) ** 0.5
return x1 * w
@cython.boundscheck(False)
cdef void assign_random_gaussian_pair(double[:] out, int assign_ix):
cdef double x1, x2, w
w = 2.0
while (w >= 1.0):
x1 = 2.0 * random_uniform() - 1.0
x2 = 2.0 * random_uniform() - 1.0
w = x1 * x1 + x2 * x2
w = sqrt((-2.0 * log(w)) / w)
out[assign_ix] = x1 * w
out[assign_ix + 1] = x2 * w
@cython.boundscheck(False)
def my_uniform(int n):
cdef int i
cdef double[:] result = np.zeros(n, dtype='f8', order='C')
for i in range(n):
result[i] = random_uniform()
return result
@cython.boundscheck(False)
def my_gaussian(int n):
cdef int i
cdef double[:] result = np.zeros(n, dtype='f8', order='C')
for i in range(n):
result[i] = random_gaussian()
return result
@cython.boundscheck(False)
def my_gaussian_fast(int n):
cdef int i
cdef double[:] result = np.zeros(n, dtype='f8', order='C')
for i in range(n // 2): # Int division ensures trailing index if n is odd.
assign_random_gaussian_pair(result, i * 2)
if n % 2 == 1:
result[n - 1] = random_gaussian()
return result
Tests. Here's a uniform benchmark:
In [3]: %timeit numpy.random.uniform(size=10000)
10000 loops, best of 3: 130 µs per loop
In [4]: %timeit numpy.array(example.my_uniform(10000))
10000 loops, best of 3: 85.4 µs per loop
So this is definitely faster than numpy
for ordinary random numbers. And if we're smart about it, it's faster for gaussian random numbers too:
In [5]: %timeit numpy.random.normal(size=10000)
1000 loops, best of 3: 393 µs per loop
In [6]: %timeit numpy.array(example.my_gaussian(10000))
1000 loops, best of 3: 542 µs per loop
In [7]: %timeit numpy.array(example.my_gaussian_fast(10000))
1000 loops, best of 3: 266 µs per loop
As confirmed by Robert Kern, numpy
uses both values generated. my_gaussian
throws one away; my_gaussian_fast
uses both and stores them quickly. (See this answer's history for a naive my_gaussian_pair
that tries to return the pair in a slow way.)
You say that you have issues implementing the Box-Muller transform because of the way they define espilon
:
const double epsilon = std::numeric_limits<double>::min();
According to here this is the C equivalent:
const double lowest_double = -DBL_MAX;
So to get the correct imports in Cython:
from libc.float import DBL_MAX #it should still be portable btw.
Now the issue with epsilon
should be resolved.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With