How can I know whether to use def, cdef or cpdef when defining a Cython function, assuming I want optimal performance?
Variable and Type Definitions

The cdef statement is used to declare C variables, either local or module-level:

    cdef int i, j, k
    cdef float f, g[42], *h

In C, types can be given names using the typedef statement. The equivalent in Cython is ctypedef:

    ctypedef int *intPtr
cdef declares a function at the C level. In C, you have to declare the return type of every function. A function may be declared to return void, which corresponds to a bare return in Python. Python is an object-oriented language.
cdef functions are quicker to call than def functions because they translate to a simple C function call. cpdef functions cause Cython to generate a cdef function (that allows a quick function call from Cython) and a def function (which allows you to call it from Python). Internally the def function just calls the cdef function.
Declaring the types of arguments and local variables (and thus return values) allows Cython to generate optimised code, which speeds up execution. If the types are declared, a TypeError is raised when the function is passed arguments of the wrong type. cdef is used for Cython functions that are intended to be pure C functions.
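As a minimal sketch of the typing behaviour described above, consider a def function with typed arguments and typed locals. The file name typed_args.pyx and the function scale are hypothetical, chosen only for illustration.

```cython
# typed_args.pyx (hypothetical file name, for illustration only)

def scale(double x, int n):
    # x and n are converted to a C double and a C int on entry;
    # an argument that cannot be converted (e.g. a str) raises TypeError.
    cdef int i               # typed loop counter lets the loop compile to plain C
    cdef double total = 0.0  # typed accumulator avoids Python float objects
    for i in range(n):
        total += x
    return total             # converted back to a Python float on return
```

After compiling the module, scale(1.5, 3) should run the loop entirely in C, while a call such as scale("a", 3) is expected to fail with a TypeError before the body executes.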
cdef functions can also specify a return type (if it is not specified then they return a Python object, PyObject* in C). def functions always return a Python object, so cannot specify a return type:

    cdef int h(int* a):
        # specify a return type and take a non-Python compatible argument
        return a[0]
This exploits early binding so that cpdef functions may be as fast as possible when using C fundamental types (by using cdef). cpdef functions use dynamic binding when passed Python objects and this might be much slower, perhaps as slow as def declared functions.
If you want optimal performance, you should know that as mentioned in this answer to a related question:
"Once the function has been called there is no difference in the speed that the code inside a cdef and a def function runs at."
So for optimal Cython performance you should always statically type all arguments and variables, and intuitively you would then be tempted to use cdef, but there are some caveats for which I constructed the flowchart below (also based on the previously mentioned answer):
Furthermore, note that:
"cpdef functions cause Cython to generate a cdef function (that allows a quick function call from Cython) and a def function (which allows you to call it from Python). Internally the def function just calls the cdef function."
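The relationship between the three declarations can be sketched as follows. This is only an illustration under assumptions: the module name shapes.pyx and the functions _area_c, area_py, area and total_area are all hypothetical.

```cython
# shapes.pyx (hypothetical module name, for illustration only)

cdef double _area_c(double w, double h):
    # Plain C function: fast to call from Cython code,
    # but not importable or callable from Python.
    return w * h

def area_py(double w, double h):
    # Ordinary Python callable; here it simply forwards to the C function.
    return _area_c(w, h)

cpdef double area(double w, double h):
    # Cython generates both a C entry point and a Python wrapper for this.
    return w * h

def total_area(int n):
    # Calls made from inside Cython code use the C entry point of area(),
    # so the per-call overhead in this loop stays small.
    cdef int i
    cdef double total = 0.0
    for i in range(n):
        total += area(i, 2.0)
    return total
```

From Python, area_py, area and total_area would all be importable from the compiled module, whereas _area_c would not. If a function is only ever called from other Cython code, cdef alone is enough. cpdef pays off when the same function is called often from Cython and occasionally from Python.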
... and from the Cython documentation:
"This exploits early binding so that cpdef functions may be as fast as possible when using C fundamental types (by using cdef). cpdef functions use dynamic binding when passed Python objects and this might be much slower, perhaps as slow as def declared functions."
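To illustrate the early versus dynamic binding mentioned in the quote, here is a hedged sketch of a cpdef method on an extension type that a Python-level subclass overrides. The names Shape, Square and sum_areas are hypothetical and do not come from the Cython documentation.

```cython
# binding.pyx (hypothetical module name, for illustration only)

cdef class Shape:
    cpdef double area(self):
        # Base implementation, reachable through a direct C-level call.
        return 0.0

class Square(Shape):
    # Regular Python subclass that overrides the cpdef method with def.
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

def sum_areas(list shapes):
    cdef Shape s             # typed variable lets Cython bind area() at the C level
    cdef double total = 0.0
    for s in shapes:
        # For a plain Shape this is a fast C call. For a Square, Cython detects
        # the Python override and dispatches through a slower Python call.
        total += s.area()
    return total
```

The design point is that cpdef keeps overriding from Python possible, at the cost of a check on each call and a full Python call whenever an override exists.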
There exists also a case-specific benchmark in the Cython documentation (calling the function often and from Python) which yields the following result: