Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cython : pure C loop optimization

Quoting the Cython documentation:

Cython recognises the usual Python for-in-range integer loop pattern:
    for i in range(n):
        ...
If i is declared as a cdef integer type, it will optimise this into a pure C loop

I wrote two versions of a simple Cython function, one using the Python range, the other one using the for-from Pyrex notation (which is supposed to be deprecated):

 def loop1(int start, int stop, int step):
    cdef int x, t = 0
    for x in range(start, stop, step):
        t += x
    return t

def loop2(int start, int stop, int step):
    cdef int x, t = 0
    for x from start <= x < stop by step:
        t += x
    return t

By looking at .cfile, I noticed that the two loops have been processed in a very different way:

The first one is actually creating a Python range, using Python objects. And it comes with 50 lines of unnecessary Python-to-C C-to-Python stuff.

The second one has been optimized into a nice pure C loop:

__pyx_t_1 = __pyx_v_stop;
__pyx_t_2 = __pyx_v_step;
for (__pyx_v_x = __pyx_v_start; __pyx_v_x < __pyx_t_1; __pyx_v_x+=__pyx_t_2) {

Am I missing something or is it a bug that I should report?

like image 653
Vincent Avatar asked Jan 27 '14 13:01

Vincent


3 Answers

The docs do mention this:

Automatic range conversion

This will convert statements of the form for i in range(...) to for i from ... when i is any cdef’d integer type, and the direction (i.e. sign of step) can be determined.

I suppose Cython wants to know the sign of step at compile time in order to generate < or > in the C loop's end condition.

See also Ticket #546 on Cython Trac

like image 159
Janne Karila Avatar answered Nov 16 '22 19:11

Janne Karila


I know this is a very old question, but for people googling and ending up here I will post an answer.

https://github.com/cython/cython/issues/3310#issuecomment-707252866

for i in range(start, stop, step):
    ...

to this:

i = start - step
for _ in range(((stop - start) + (step - 1))//step):
    i += step
    ...    
like image 27
Underscore_ Avatar answered Nov 16 '22 19:11

Underscore_


Actually, it is possible to convert any for loop over a range into a fully-optimized C loop, assuming the start, stop, and step are C variables. You just have to write it a little cleverly.

Start with loop1():

def loop1(int start, int stop, int step):
    cdef int x, t = 0
    for x in range(start, stop, step):
        t += x
    return t

Cython (currently) doesn't know how to optimize this because it doesn't know the sign of step. As it turns out, the simplest solution to this problem is the solution to a slightly more general problem. To wit:

def loop1(int start, int stop, int step):
    cdef:
        int x
        int count
        int t = 0
    for count, x in enumerate(range(start, stop, step)):
        t += x
    return t

The count variable looks useless, but a different version of the problem might use it in the loop body.

Now, calculate the index by hand:

def loop1(int start, int stop, int step):
    cdef:
        int x
        int count
        int length
        int t = 0
    length = len(range(start, stop, step))  # could optimize this further
    for count in range(length):
        x = start + count*step
        t += x
    return t

I have tried this, and I know it produces pure C code (except for the length = line). In fact, I've successfully used it inside a nogil block. cython -a shows all-white output for the loop itself.

This will create two extra variables, along with some dead stores etc., but I assume any halfway-decent compiler should be capable of eliminating those on -O2. It is therefore suitable for high-performance loops.

like image 29
Kevin Avatar answered Nov 16 '22 19:11

Kevin