
What are the differences between a cpdef and a cdef wrapped in a def?

Tags: python, cython

In the Cython docs there is an example where they give two ways of writing a C/Python hybrid method. An explicit one with a cdef for fast C access and a wrapper def for access from Python:

cdef class Rectangle:
    cdef int x0, y0
    cdef int x1, y1
    def __init__(self, int x0, int y0, int x1, int y1):
        self.x0 = x0; self.y0 = y0; self.x1 = x1; self.y1 = y1
    cdef int _area(self):
        cdef int area
        area = (self.x1 - self.x0) * (self.y1 - self.y0)
        if area < 0:
            area = -area
        return area
    def area(self):
        return self._area()

And one using cpdef:

cdef class Rectangle:
    cdef int x0, y0
    cdef int x1, y1
    def __init__(self, int x0, int y0, int x1, int y1):
        self.x0 = x0; self.y0 = y0; self.x1 = x1; self.y1 = y1
    cpdef int area(self):
        cdef int area
        area = (self.x1 - self.x0) * (self.y1 - self.y0)
        if area < 0:
            area = -area
        return area

I was wondering what the differences are in practical terms.

For example, is either method faster/slower when called from C/Python?

Also, when subclassing/overriding does cpdef offer anything that the other method lacks?

Paul Panzer asked Feb 19 '18 10:02


2 Answers

chrisb's answer gives you all you need to know, but if you are game for gory details...

But first, the takeaways from the lengthy analysis below, in a nutshell:

  • For free functions, there is not much performance difference between cpdef and rolling it out with cdef+def. The resulting C code is almost identical.

  • For bound methods, the cpdef approach can be slightly faster in the presence of inheritance hierarchies, but nothing to get too excited about.

  • The cpdef syntax has its advantages, as the resulting code is clearer (at least to me) and shorter.


Free functions:

When we define something silly like:

 cpdef do_nothing_cp():
   pass

the following happens:

  1. a fast c-function is created (in this case it has the cryptic name __pyx_f_3foo_do_nothing_cp because my extension is called foo, but you only have to look for the f prefix).
  2. a python-function is also created (called __pyx_pf_3foo_2do_nothing_cp, prefix pf); it does not duplicate the code but calls the fast function somewhere along the way.
  3. a python-wrapper is created, called __pyx_pw_3foo_3do_nothing_cp (prefix pw).
  4. a method definition for do_nothing_cp is emitted; this is what the python-wrapper is needed for, and it records which function should be called when foo.do_nothing_cp is invoked.

You can see it in the produced c-code here:

 static PyMethodDef __pyx_methods[] = {
  {"do_nothing_cp", (PyCFunction)__pyx_pw_3foo_3do_nothing_cp, METH_NOARGS, 0},
  {0, 0, 0, 0}
};

For a cdef function, only the first step happens; for a def function, only steps 2-4.

Now when we load module foo and invoke foo.do_nothing_cp() the following happens:

  1. The function pointer bound to name do_nothing_cp is found, in our case the python-wrapper pw-function.
  2. The pw-function is called via the function pointer and calls the pf-function (as a plain C call).
  3. pf-function calls the fast f-function.

What happens if we call do_nothing_cp inside the cython-module?

def call_do_nothing_cp():
    do_nothing_cp()

Clearly, cython doesn't need the python machinery to locate the function in this case - it can directly use the fast f-function via a c-function call, bypassing pw and pf functions.
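
The two call paths described so far can be mimicked in plain Python. This is only a toy model under stated assumptions: the names f_do_nothing_cp, pf_do_nothing_cp, pw_do_nothing_cp, METHOD_TABLE and trace are hypothetical stand-ins for the generated C functions and the PyMethodDef table, not anything Cython emits or exposes.

```python
# Toy model of the layers generated for a cpdef function.
# All names are hypothetical stand-ins for the generated C code.
trace = []

def f_do_nothing_cp():
    # stands in for the fast C function (prefix f)
    trace.append("f")
    return None

def pf_do_nothing_cp(module):
    # stands in for the Python-level function (prefix pf)
    trace.append("pf")
    return f_do_nothing_cp()

def pw_do_nothing_cp(module, unused_args):
    # stands in for the Python wrapper (prefix pw)
    trace.append("pw")
    return pf_do_nothing_cp(module)

# stands in for the PyMethodDef table: name -> wrapper
METHOD_TABLE = {"do_nothing_cp": pw_do_nothing_cp}

def call_from_python(name):
    # what roughly happens for foo.do_nothing_cp() called from Python
    wrapper = METHOD_TABLE[name]
    return wrapper(None, None)

def call_inside_module():
    # what roughly happens for a call from within the Cython module
    return f_do_nothing_cp()

def run(path):
    # record which layers a given call path goes through
    trace.clear()
    path()
    return list(trace)

from_python = run(lambda: call_from_python("do_nothing_cp"))
inside_module = run(call_inside_module)
print(from_python)    # ['pw', 'pf', 'f']
print(inside_module)  # ['f']
```

The model ignores argument parsing and reference counting; it only shows which layers are traversed on each path.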

What happens if we wrap cdef function in a def-function?

cdef _do_nothing():
   pass

def do_nothing():
  _do_nothing()

Cython does the following:

  1. a fast _do_nothing-function is created, corresponding to the f-function above.
  2. a pf-function for do_nothing is created, which calls _do_nothing somewhere on the way.
  3. a python-wrapper, i.e. pw function is created which wraps the pf-function
  4. the functionality is bound to foo.do_nothing via function-pointer to the python-wrapper pw-function.

As you can see - not much difference to the cpdef-approach.

The cdef functions are just simple C functions, but def and cpdef functions are first-class Python functions, so you could do something like this:

foo.do_nothing=foo.do_nothing_cp
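
A minimal sketch of what this rebinding does, using a stand-in module built with types.ModuleType, since a compiled foo is not assumed to be available here. The module name fake_foo and the two plain Python functions are hypothetical; the point is only that anything exposed as a module attribute can be rebound, while a cdef function never appears as an attribute in the first place.

```python
import types

# Hypothetical stand-in for the compiled extension module foo.
fake_foo = types.ModuleType("fake_foo")

def do_nothing():
    # plays the role of the def wrapper around a cdef function
    return "def wrapper"

def do_nothing_cp():
    # plays the role of the cpdef function
    return "cpdef function"

fake_foo.do_nothing = do_nothing
fake_foo.do_nothing_cp = do_nothing_cp

# Rebinding works because both are ordinary Python-visible objects.
fake_foo.do_nothing = fake_foo.do_nothing_cp
result = fake_foo.do_nothing()
print(result)  # cpdef function

# The cdef helper was never exported, so there is nothing to rebind.
print(hasattr(fake_foo, "_do_nothing"))  # False
```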

As to performance, we cannot expect much difference here:

>>> import foo
>>> %timeit foo.do_nothing_cp()
51.6 ns ± 0.437 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

>>> %timeit foo.do_nothing()
51.8 ns ± 0.369 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
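
For anyone who wants to repeat this kind of measurement outside IPython, here is a rough sketch with the standard timeit module. The pure-Python functions below are hypothetical stand-ins so that the snippet runs without a compiled extension; with a real build you would pass foo.do_nothing_cp and foo.do_nothing instead. The numbers it prints depend on the machine and are not comparable to the ones above.

```python
import timeit

def _do_nothing():
    pass

def do_nothing():
    # stand-in for a def wrapper that forwards to a helper
    _do_nothing()

def do_nothing_flat():
    # stand-in for a function without the extra forwarding call
    pass

def ns_per_call(func, number=100_000, repeat=5):
    """Best-of-`repeat` time per call of `func`, in nanoseconds."""
    timer = timeit.Timer(func)
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9

for func in (do_nothing, do_nothing_flat):
    print(f"{func.__name__}: {ns_per_call(func):.1f} ns per call")
```

Passing a callable to timeit.Timer adds one extra call per iteration, so the absolute values include that overhead; only the difference between the two lines is meaningful.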

If we look at the resulting machine code (objdump -d foo.so), we can see that the C-compiler has inlined all calls for the cpdef-version do_nothing_cp:

 0000000000001340 <__pyx_pw_3foo_3do_nothing_cp>:
    1340:   48 8b 05 91 1c 20 00    mov    0x201c91(%rip),%rax      
    1347:   48 83 00 01             addq   $0x1,(%rax)
    134b:   c3                      retq   
    134c:   0f 1f 40 00             nopl   0x0(%rax)

but not for the rolled out do_nothing (I must confess, I'm a little bit surprised and don't understand the reasons yet):

0000000000001380 <__pyx_pw_3foo_1do_nothing>:
    1380:   53                      push   %rbx
    1381:   48 8b 1d 50 1c 20 00    mov    0x201c50(%rip),%rbx        # 202fd8 <_DYNAMIC+0x208>
    1388:   48 8b 13                mov    (%rbx),%rdx
    138b:   48 85 d2                test   %rdx,%rdx
    138e:   75 0d                   jne    139d <__pyx_pw_3foo_1do_nothing+0x1d>
    1390:   48 8b 43 08             mov    0x8(%rbx),%rax
    1394:   48 89 df                mov    %rbx,%rdi
    1397:   ff 50 30                callq  *0x30(%rax)
    139a:   48 8b 13                mov    (%rbx),%rdx
    139d:   48 83 c2 01             add    $0x1,%rdx
    13a1:   48 89 d8                mov    %rbx,%rax
    13a4:   48 89 13                mov    %rdx,(%rbx)
    13a7:   5b                      pop    %rbx
    13a8:   c3                      retq   
    13a9:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)

This could explain why the cpdef version is slightly faster, but in any case the difference is nothing compared to the overhead of a Python function call.


Class-methods:

The situation is a little more complicated for methods of a class (bound methods), because of the possible polymorphism. Let's start out with:

cdef class A:
   cpdef do_nothing_cp(self):
       pass

At first sight, there is not that much difference to the case above:

  1. A fast, c-only, f-prefix-version of the function is emitted
  2. A python (prefix pf) version is emitted, which calls the f-function
  3. A python wrapper (prefix pw) wraps the pf-version and is used for registration.
  4. do_nothing_cp is registered as a method of class A via tp_methods-pointer of the PyTypeObject.

As can be seen in the produced c-file:

static PyMethodDef __pyx_methods_3foo_A[] = {
      {"do_nothing_cp", (PyCFunction)__pyx_pw_3foo_1A_1do_nothing_cp, METH_NOARGS, 0},
      ...
      {0, 0, 0, 0}
    }; 
.... 
static PyTypeObject __pyx_type_3foo_A = {
 ...
  __pyx_methods_3foo_A, /*tp_methods*/
 ...
};

Clearly, the bound version has to take the implicit parameter self as an additional argument, but there is more to it: the f-function performs a dispatch whenever it is not called from the corresponding pf-function. This dispatch looks as follows (I keep only the important parts):

static PyObject *__pyx_f_3foo_1A_do_nothing_cp(CYTHON_UNUSED struct __pyx_obj_3foo_A *__pyx_v_self, int __pyx_skip_dispatch) {

  if (unlikely(__pyx_skip_dispatch)) {
    /* __pyx_skip_dispatch=1 if called from the pf-version: nothing to check */
  }
  /* Check if overridden in Python */
  else if (/* pseudocode: look up whether the function is overridden in __dict__ of the object */) {
    /* pseudocode: use the overriding function and return its result */
  }
  /* pseudocode: do the work */
}

Why is it needed? Consider the following extension foo:

cdef class A:
    cpdef do_nothing_cp(self):
        pass

cdef class B(A):
    cpdef call_do_nothing(self):
        self.do_nothing_cp()

What happens when we call B().call_do_nothing()?

  1. B-pw-call_do_nothing is located and called.
  2. it calls B-pf-call_do_nothing,
  3. which calls B-f-call_do_nothing,
  4. which calls A-f-do_nothing_cp, bypassing pw and pf-versions.

What happens when we add the following class C, which overrides the do_nothing_cp-function?

import foo
class C(foo.B):
    def do_nothing_cp(self):
        print("I do something!")

Now calling C().call_do_nothing() leads to:

  1. call_do_nothing of the C-class is located and called, which means pw-call_do_nothing of the B-class is located and called,
  2. which calls B-pf-call_do_nothing,
  3. which calls B-f-call_do_nothing,
  4. which calls A-f-do_nothing_cp (as we already know!), bypassing pw and pf-versions.

And now, in step 4, we need to dispatch the call in A-f-do_nothing_cp() in order to get the right C.do_nothing_cp() call! Luckily we have this dispatch in the function at hand!
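
The dispatch can be modelled in pure Python to make the control flow concrete. This is a sketch under stated assumptions: the _f_ prefixed methods and the skip_dispatch flag are hypothetical stand-ins for the generated f-functions and __pyx_skip_dispatch, and the override check is simplified to an identity comparison on the class attribute, which is not how the generated C code performs the lookup.

```python
class A:
    def _f_do_nothing_cp(self, skip_dispatch=False):
        # stand-in for the f-function of A.do_nothing_cp
        if not skip_dispatch:
            method = type(self).do_nothing_cp
            if method is not A.do_nothing_cp:
                # overridden in a Python subclass: hand over to it
                return method(self)
        return "A does the work"

    def do_nothing_cp(self):
        # stand-in for pw/pf: Python lookup has already picked this
        # version, so the f-function may skip the dispatch
        return self._f_do_nothing_cp(skip_dispatch=True)


class B(A):
    def _f_call_do_nothing(self):
        # direct "C-level" call to A's f-function, bypassing pw and pf
        return A._f_do_nothing_cp(self)

    def call_do_nothing(self):
        return self._f_call_do_nothing()


class C(B):
    def do_nothing_cp(self):
        return "I do something!"


print(B().call_do_nothing())  # A does the work
print(C().call_do_nothing())  # I do something!
```

Without the check inside _f_do_nothing_cp, the second call would also end up doing A's work, which is exactly the situation the dispatch in the f-function prevents.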

To make it more complicated: what if the class C were also a cdef-class? The dispatch via __dict__ would not work, because cdef-classes don't have a __dict__.

For cdef-classes, polymorphism is implemented similarly to C++'s "virtual tables", so in B.call_do_nothing() the f-do_nothing_cp-function is not called directly but via a pointer, which depends on the class of the object (one can see those "virtual tables" being set up in __pyx_pymod_exec_XXX, e.g. __pyx_vtable_3foo_B.__pyx_base). Thus the __dict__-dispatch in the A-f-do_nothing_cp()-function is not needed in the case of a pure cdef-hierarchy.
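
The "virtual table" idea can be sketched in Python with one dictionary of functions per class. Everything here is a hypothetical illustration: D stands for an imagined cdef subclass of B that overrides do_nothing_cp, and the dictionaries stand in for the per-class C structs of function pointers.

```python
def a_f_do_nothing_cp(obj):
    return "A.do_nothing_cp"

def d_f_do_nothing_cp(obj):
    return "D.do_nothing_cp"

def b_f_call_do_nothing(obj):
    # indirect call through the table stored on the object,
    # instead of calling a_f_do_nothing_cp directly
    return obj.vtable["do_nothing_cp"](obj)

# One table per class; a subclass copies the base table and
# replaces or adds entries.
VTABLE_A = {"do_nothing_cp": a_f_do_nothing_cp}
VTABLE_B = {**VTABLE_A, "call_do_nothing": b_f_call_do_nothing}
VTABLE_D = {**VTABLE_B, "do_nothing_cp": d_f_do_nothing_cp}

class Instance:
    """Minimal object that only carries a pointer to its class table."""
    def __init__(self, vtable):
        self.vtable = vtable

b = Instance(VTABLE_B)
d = Instance(VTABLE_D)
b_result = b.vtable["call_do_nothing"](b)
d_result = d.vtable["call_do_nothing"](d)
print(b_result)  # A.do_nothing_cp
print(d_result)  # D.do_nothing_cp
```

The same b_f_call_do_nothing code reaches different implementations depending on the table attached to the object, with no __dict__ lookup involved.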


As to performance, comparing cpdef with cdef+def I get:

                          cpdef         def+cdef
 A.do_nothing              107 ns        108 ns
 B.call_do_nothing         109 ns        116 ns

so the difference isn't that large, with cpdef being slightly faster, if anything.

ead answered Nov 15 '22 19:11


See the Cython docs: for most purposes they are practically the same; cpdef has slightly more overhead but plays nicer with inheritance.

The docs say that the directive cpdef makes two versions of the method available: one fast for use from Cython and one slower for use from Python. They continue:

This does slightly more than providing a python wrapper for a cdef method: unlike a cdef method, a cpdef method is fully overridable by methods and instance attributes in Python subclasses. It adds a little calling overhead compared to a cdef method.

chrisb answered Nov 15 '22 20:11