
Should list item type be defined in cython?

If I pass a Python list to a Cython function to iterate over, am I supposed to declare the type of the list items? Also, what is the best way to loop over a list in Cython? For example:

#Cython function, passed a list of float items
def cython_f(list example_list):
    cdef int i
    for i in range(len(example_list)):
        #Do stuff
        #but list item type not defined?
        pass

    #Alternative loop
    cdef float j  # declaration of list item type
    for j in example_list:
        #Do stuff
        pass

Is any speed gained by declaring the list item type? Is it preferable to pass NumPy arrays instead of Python lists?

kezzos asked May 16 '14 22:05


1 Answer

In Cython you are not obliged to declare anything, but declaring types usually helps performance. "Usually", because if you declare a C type and then use the variable mostly as a Python object, you can trigger extra type checks and conversions between C values and Python objects (packing and unpacking). The only way to be sure is to measure.
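
As a minimal sketch of what "measure" could look like, the snippet below times the two loop styles from the question with the standard-library timeit module. The functions sum_indexed and sum_direct, and the helper best_time, are hypothetical pure-Python stand-ins; in practice you would import your compiled Cython variants and time those instead.

```python
import timeit


def sum_indexed(values):
    """Index-based loop, mirroring `for i in range(len(example_list))`."""
    total = 0.0
    for i in range(len(values)):
        total += values[i]
    return total


def sum_direct(values):
    """Direct iteration, mirroring `for j in example_list`."""
    total = 0.0
    for value in values:
        total += value
    return total


def best_time(func, data, repeat=5, number=20):
    """Return the best per-call time in seconds over several repeats."""
    timings = timeit.repeat(lambda: func(data), repeat=repeat, number=number)
    return min(timings) / number


example_list = [float(i) for i in range(10_000)]
for func in (sum_indexed, sum_direct):
    per_call_us = best_time(func, example_list) * 1e6
    print(f"{func.__name__}: {per_call_us:.1f} us per call")
```

Taking the minimum over several repeats reduces noise from other processes. The absolute numbers depend on the machine, so only the comparison between variants is meaningful.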

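To declare the type of the list items, put cdef float value at the beginning of the function and assign value = example_list[i] inside the loop. Note that a Python float is a C double, so cdef double value avoids losing precision.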

Should you use a list or a NumPy array? An array is a uniform data container. This means you can declare its element type, for example float32_t, and Cython will know how to work with it at C speed (element access is faster because the data is laid out in a regular, strided block of memory). On the other hand, if you are going to change the size, you are probably better off using lists (or, for very heavy use, perhaps libcpp.vector). So the answer is that it depends on what you do, but in most cases an array is better.
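
To make "uniform data container" a bit more concrete without requiring NumPy, here is a small sketch that uses the standard-library array module as a stand-in (an assumption for illustration, not what the answer itself uses). It shows the properties that matter: a single item type, a fixed item size, and a contiguous strided buffer. Cython's typed memoryviews are built on this same buffer protocol.

```python
from array import array

example_list = [0.5, 1.5, 2.5, 3.5]

# A list may hold anything; every item is a separate Python object.
mixed_list = example_list + ["not a float"]

# 'd' means C double, the C type behind a Python float.
typed = array("d", example_list)

# Inspect the underlying buffer, then release it before touching the array.
with memoryview(typed) as view:
    layout = (view.format, view.itemsize, view.strides, view.contiguous)
print("format, itemsize, strides, contiguous:", layout)

# The typed container rejects items of any other type.
try:
    typed.append("not a float")
    rejected = False
except TypeError:
    rejected = True
print("non-float rejected:", rejected)
```

Because every element has the same type and size, compiled code can compute the address of element i directly instead of following a pointer to a boxed Python object.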

To be fair, you also have to consider where your data already lives. If everything is in lists, your function may run faster on arrays, but list -> array -> f_array -> array -> list may be slower overall than list -> f_list -> list. If you don't care, as a rule of thumb, use arrays when the length is constant and lists otherwise. Also note that NumPy arrays use less memory for large amounts of data.
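
A rough sketch of both points, the conversion round trip and the memory footprint, again with the standard-library array module as an assumed stand-in for NumPy. The helper names (scale_list, scale_array, scale_via_array, list_bytes) are hypothetical, and the byte counts are CPython-specific estimates.

```python
import sys
from array import array


def scale_list(values, factor):
    """list -> f_list -> list: no conversion needed."""
    return [v * factor for v in values]


def scale_array(values, factor):
    """f_array: takes a typed buffer and returns a new one."""
    return array("d", (v * factor for v in values))


def scale_via_array(values, factor):
    """list -> array -> f_array -> array -> list: two extra conversions."""
    typed_input = array("d", values)            # list -> array
    result = scale_array(typed_input, factor)   # f_array
    return result.tolist()                      # array -> list


def list_bytes(values):
    """Approximate footprint: the list itself plus each boxed float."""
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)


example_list = [float(i) for i in range(100_000)]
typed = array("d", example_list)

print("list :", list_bytes(example_list), "bytes")
print("array:", sys.getsizeof(typed), "bytes")
```

Both paths give the same result. Whether the two extra conversions pay off depends on how much work the function does per element, which again is something to time rather than guess.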

Davidmh answered Sep 18 '22 15:09