Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Iterating over arrays in cython, is list faster than np.array?

TLDR: in cython, why (or when?) is iterating over a numpy array faster than iterating over a python list?

Generally: I've used Cython before and was able to get tremendous speed ups over naive python impl', However, figuring out what exactly needs to be done seems non-trivial.

Consider the following 3 implementations of a sum() function. They reside in a cython file called 'cy' (obviously, there's np.sum(), but that's besides my point..)

Naive python:

def sum_naive(A):
   s = 0
   for a in A:
       s += a
   return s

Cython with a function that expects a python list:

def sum_list(A):
    cdef unsigned long s = 0
    for a in A:
        s += a
    return s

Cython with a function that expects a numpy array.

def sum_np(np.ndarray[np.int64_t, ndim=1] A):
    cdef unsigned long s = 0
    for a in A:
        s += a
    return s

I would expect that in terms of running time, sum_np < sum_list < sum_naive, however, the following script demonstrates to the contrary (for completeness, I added np.sum() )

N = 1000000
v_np = np.array(range(N))
v_list = range(N)

%timeit cy.sum_naive(v_list)
%timeit cy.sum_naive(v_np)
%timeit cy.sum_list(v_list)
%timeit cy.sum_np(v_np)
%timeit v_np.sum()

with results:

In [18]: %timeit cyMatching.sum_naive(v_list)
100 loops, best of 3: 18.7 ms per loop

In [19]: %timeit cyMatching.sum_naive(v_np)
1 loops, best of 3: 389 ms per loop

In [20]: %timeit cyMatching.sum_list(v_list)
10 loops, best of 3: 82.9 ms per loop

In [21]: %timeit cyMatching.sum_np(v_np)
1 loops, best of 3: 1.14 s per loop

In [22]: %timeit v_np.sum()
1000 loops, best of 3: 659 us per loop

What's going on? Why is cython+numpy slow?

P.S.
I do use
#cython: boundscheck=False
#cython: wraparound=False

like image 880
Tomer Levinboim Avatar asked Sep 21 '13 21:09

Tomer Levinboim


People also ask

Are NP arrays faster than lists?

NumPy Arrays Are Faster Than Lists As predicted, we can see that NumPy arrays are significantly faster than lists.

Is NumPy written in Cython?

NumPy is mostly written in C. The main advantage of Python is that there are a number of ways of very easily extending your code with C (ctypes, swig,f2py) / C++ (boost. python, weave.


2 Answers

There is a better way to implement this in cython that at least on my machine beats np.sum because it avoids type checking and other things that numpy normally has to do when dealing with an arbitrary array:

#cython.wraparound=False
#cython.boundscheck=False
cimport numpy as np

def sum_np(np.ndarray[np.int64_t, ndim=1] A):
    cdef unsigned long s = 0
    for a in A:
        s += a
    return s

def sum_np2(np.int64_t[::1] A):
    cdef:
        unsigned long s = 0
        size_t k

    for k in range(A.shape[0]):
        s += A[k]

    return s

And then the timings:

N = 1000000
v_np = np.array(range(N))
v_list = range(N)

%timeit sum(v_list)
%timeit sum_naive(v_list)
%timeit np.sum(v_np)
%timeit sum_np(v_np)
%timeit sum_np2(v_np)
10 loops, best of 3: 19.5 ms per loop
10 loops, best of 3: 64.9 ms per loop
1000 loops, best of 3: 1.62 ms per loop
1 loops, best of 3: 1.7 s per loop
1000 loops, best of 3: 1.42 ms per loop

You don't want to iterate over the numpy array via the Python style, but rather access elements using indexing as that it can be translated into pure C, rather than relying on the Python API.

like image 166
JoshAdel Avatar answered Oct 17 '22 20:10

JoshAdel


a is untyped and thus there will be lots of conversions from Python to C types and back. These can be slow.

JoshAdel correctly pointed out that instead of iterating though, you should iterate over a range. Cython will convert the indexing to C, which is fast.


Using cython -a myfile.pyx will highlight these sorts of things for you; you want all of your loop logic to be white for maximum speed.

PS: Note that np.ndarray[np.int64_t, ndim=1] is outdated and has been deprecated in favour of the faster and more general long[:].

like image 30
Veedrac Avatar answered Oct 17 '22 20:10

Veedrac