I have a problem that is pretty much embarrassingly parallel, and I think I've hit the limits of how fast I can make it with plain Python and multiprocessing, so I'm now attempting to take it to a lower level via Cython and, hopefully, OpenMP.
In short, I am wondering how I can use OpenMP with Cython, or whether I'll have to wrap some raw C code and load/bind to it via Cython.
Or can I have Cython compile down to C code, modify that C code to add the OpenMP pragmas, then compile it to a library and load it into Python?
In this chapter we will learn about Cython's multithreading features to access thread-based parallelism. Our focus will be on the prange Cython function, which allows us to easily transform serial for loops to use multiple threads and tap into all available CPU cores.
Various Python packages such as NumPy, SciPy and pandas can utilize OpenMP to run on multiple CPUs.
Because Cython code compiles to C, it can interact with those libraries directly and take Python's bottlenecks out of the loop. NumPy in particular works well with Cython: Cython has native support for specific constructions in NumPy and provides fast access to NumPy arrays.
This question is from 3 years ago, and nowadays Cython provides functions that use an OpenMP backend. See for example the documentation here. One very convenient function is prange. This is one example of how a (rather naive) dot function could be implemented using prange.
Don't forget to compile with OpenMP enabled by passing the appropriate argument to the C compiler, for example "/openmp" for MSVC or "-fopenmp" for GCC.
import numpy as np
cimport numpy as np
import cython
from cython.parallel import prange

ctypedef np.double_t cDOUBLE
DOUBLE = np.float64

def mydot(np.ndarray[cDOUBLE, ndim=2] a, np.ndarray[cDOUBLE, ndim=2] b):
    cdef np.ndarray[cDOUBLE, ndim=2] c
    cdef int i, M, N, K

    c = np.zeros((a.shape[0], b.shape[1]), dtype=DOUBLE)
    M = a.shape[0]
    N = a.shape[1]
    K = b.shape[1]

    # Each iteration writes only to row i of c, so the rows can be
    # computed by different threads without synchronisation.
    for i in prange(M, nogil=True):
        multiply(&a[i,0], &b[0,0], &c[i,0], N, K)

    return c

@cython.wraparound(False)
@cython.boundscheck(False)
@cython.nonecheck(False)
cdef void multiply(double *a, double *b, double *c, int N, int K) nogil:
    # a: one row of length N; b: N x K matrix in row-major order;
    # c: output row of length K, accumulated in place.
    cdef int j, k
    for j in range(N):
        for k in range(K):
            c[k] += a[j]*b[k+j*K]
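To check the output of a function like mydot on small inputs, a plain-Python reference of the same row-by-row scheme can help. The sketch below is only an illustration: the names multiply_row and mydot_reference are made up for it and are not part of Cython or NumPy, and it uses nested lists instead of arrays. It also shows why the outer loop suits prange, since each iteration touches only its own output row.

```python
def multiply_row(a_row, b, c_row):
    """Accumulate the product of one row of a with the matrix b into c_row."""
    for j, a_val in enumerate(a_row):
        b_row = b[j]
        for k in range(len(c_row)):
            c_row[k] += a_val * b_row[k]


def mydot_reference(a, b):
    """Serial reference: output row i depends only on row i of a."""
    n_cols = len(b[0])
    c = [[0.0] * n_cols for _ in a]
    # This outer loop is the one the Cython version hands to prange.
    for i, a_row in enumerate(a):
        multiply_row(a_row, b, c[i])
    return c


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(mydot_reference(a, b))  # [[19.0, 22.0], [43.0, 50.0]]
```

Comparing this against the compiled function on a few small matrices is a cheap way to catch indexing mistakes before timing anything.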
If somebody stumbles over this question:
There is now direct support for OpenMP in Cython via the cython.parallel module; see http://docs.cython.org/src/userguide/parallelism.html
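Code that uses cython.parallel still has to be built with OpenMP enabled. As a rough sketch of how the flags could be chosen in a build script: the helper names openmp_flags and extension_kwargs are hypothetical, as is the file name mydot.pyx, and only the two common cases (MSVC, and GCC-style compilers) are covered. Other toolchains, such as Apple's Clang, may need different options.

```python
def openmp_flags(compiler_type):
    """Return (extra_compile_args, extra_link_args) for a compiler type name.

    "msvc" is assumed to mean Microsoft's compiler; anything else is
    treated as a GCC-style compiler that accepts -fopenmp.
    """
    if compiler_type == "msvc":
        return ["/openmp"], []
    return ["-fopenmp"], ["-fopenmp"]


def extension_kwargs(name, source, compiler_type):
    """Build keyword arguments suitable for a setuptools Extension."""
    compile_args, link_args = openmp_flags(compiler_type)
    return {
        "name": name,
        "sources": [source],
        "extra_compile_args": compile_args,
        "extra_link_args": link_args,
    }


# In a setup.py, these keyword arguments could be used roughly like this:
#   from setuptools import Extension, setup
#   from Cython.Build import cythonize
#   import numpy
#   ext = Extension(include_dirs=[numpy.get_include()],
#                   **extension_kwargs("mydot", "mydot.pyx", "unix"))
#   setup(ext_modules=cythonize([ext]))
print(extension_kwargs("mydot", "mydot.pyx", "unix"))
```

Keeping the flag selection in a small function makes it easy to see at a glance which options each platform receives.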