Why is __setitem__ much faster than an equivalent "normal" method for cdef-classes?

Tags: performance, python, python-3.x, cython

It looks like, for Cython's cdef-classes, a special method is sometimes faster than an identical "usual" method. For example, __setitem__ is about 3 times faster than setitem:

%%cython
cdef class CyA:
    def __setitem__(self, index, val):
        pass
    def setitem(self, index, val):
        pass

and now:

cy_a=CyA()
%timeit cy_a[0]=3              # 32.4 ns ± 0.195 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%timeit cy_a.setitem(0,3)      # 97.5 ns ± 0.389 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

This is neither the "normal" behavior for Python, where the special methods are even somewhat slower (and obviously slower than the Cython equivalent):

class PyA:
    def __setitem__(self, index, val):
        pass
    def setitem(self, index, val):
        pass

py_a=PyA()
%timeit py_a[0]=3           # 198 ns ± 2.51 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit py_a.setitem(0,3)   # 123 ns ± 0.619 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

nor is this the case in Cython for all special functions:

%%cython
cdef class CyA:
    ...
    def __len__(self):
        return 1
    def len(self):
        return 1

which leads to:

cy_a=CyA()
%timeit len(cy_a)    #  59.6 ns ± 0.233 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%timeit cy_a.len()   #  66.5 ns ± 0.326 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

i.e. almost identical running times.

Why is __setitem__(...) so much faster than setitem(...) in a cdef-class, even though both are cythonized?


asked Nov 28 '18 21:11

ead

1 Answer

There's quite a bit of overhead for a generic Python method call: Python looks up the relevant attribute (a dictionary lookup), ensures that the attribute is a callable object, calls it, and then handles the result. This overhead also applies to generic def functions of cdef classes (the only difference being that the implementation of the method is defined in C).
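The two call paths can be seen at the bytecode level. The following is a minimal sketch using the standard `dis` module; the helper name `opnames` is made up for illustration, and the exact instruction names vary between CPython versions, so only the general shape matters: the subscript assignment compiles to a single dedicated instruction, while the method call needs an attribute load followed by a generic call.

```python
import dis


def opnames(source):
    """Return the bytecode instruction names for one statement (hypothetical helper)."""
    code = compile(source, "<example>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]


# Subscript assignment: dispatched through a dedicated STORE_SUBSCR instruction.
subscript_ops = opnames("obj[0] = 3")

# Ordinary method call: an attribute lookup by name, then a generic call.
method_ops = opnames("obj.setitem(0, 3)")

print(subscript_ops)
print(method_ops)
```

The names are only compiled, never executed, so `obj` does not need to exist.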

However, special methods on C/Cython classes can be optimised, as follows:

Lookup speed

As a shortcut, PyTypeObject in the Python C API defines a number of different "slots", which are direct function pointers for special methods. For __setitem__ there are actually two available: PyMappingMethods.mp_ass_subscript, which corresponds to a generic "mapping" call, and PySequenceMethods.sq_ass_item, which takes a C integer (Py_ssize_t) as the index directly and corresponds to the C API function PySequence_SetItem.

For a cdef class, Cython only seems to generate the first (generic) one, so the speedup isn't from passing a C int directly. Cython does not fill these slots when generating a non-cdef class.

The advantage of these is that (for a C/Cython class) finding the __setitem__ function just involves a couple of pointer NULL checks followed by a C function call. This also applies to __len__, which is also defined by slots in PyTypeObject.
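The difference is visible from plain Python on CPython. This sketch uses the built-in `list` as a stand-in for a cdef class (an assumption: both are types implemented in C, so their special methods should be exposed the same way); no Cython is needed to run it.

```python
class PyA:
    def __setitem__(self, index, val):
        pass

    def __len__(self):
        return 1


# For a C-implemented type, the special method is a thin wrapper around a slot.
c_setitem_kind = type(list.__setitem__).__name__
c_len_kind = type(list.__len__).__name__

# For a Python class, it is an ordinary function stored in the class dictionary.
py_setitem_kind = type(PyA.__setitem__).__name__

print(c_setitem_kind, c_len_kind, py_setitem_kind)
# On CPython: wrapper_descriptor wrapper_descriptor function
```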

In contrast:

For a Python class calling __setitem__, the slot instead holds a default implementation which does a dictionary lookup for the string "__setitem__".
For either a cdef or Python class calling a non-special def function, the attribute is looked up from the class/instance dictionary (which is slower).
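The by-name lookup for a Python class can be observed indirectly. In this small sketch (the `calls` list is only there for illustration), `a[0] = 3` ignores a `__setitem__` stored on the instance and picks up a replacement assigned on the class afterwards, which is consistent with the name being looked up on the type at call time.

```python
class PyA:
    def __init__(self):
        self.calls = []

    def __setitem__(self, index, val):
        self.calls.append(("class", index, val))


a = PyA()

# An instance attribute named __setitem__ is not consulted by a[...] = ...
a.__setitem__ = lambda index, val: a.calls.append(("instance", index, val))
a[0] = 3

# Replacing the function on the class takes effect for the next assignment.
PyA.__setitem__ = lambda self, index, val: self.calls.append(("patched", index, val))
a[1] = 4

print(a.calls)  # [('class', 0, 3), ('patched', 1, 4)]
```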

Note that if the regular setitem function were defined in a cdef class as cpdef instead (and called from Cython), Cython would use its own mechanism for a speedy lookup.

Calling efficiency

Having found the attribute it must be called. Where the special functions have been retrieved from PyTypeObject (e.g. __setitem__ and __len__ on a cdef class), they are simply C function pointers and so can be called directly.

For every other case, the PyObject retrieved from attribute lookup must be checked to see if it's a callable, and then called.

Return handling

When __setitem__ is called from PyTypeObject as a special function the return value is an int, which is simply used as an error flag. No reference counting or handling of Python objects is needed.

When __len__ is called from a PyTypeObject as a special function, the return type is a Py_ssize_t, which must be converted to a Python object and then destroyed when no longer needed.

For normal functions (e.g. setitem called from a Python or Cython class, or __setitem__ defined in a Python class), the return value is a PyObject*, which must be reference counted/destroyed appropriately.
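The C-level return types can be checked from Python with `ctypes`, as a rough sketch that assumes CPython (where `ctypes.pythonapi` is available). `PyObject_SetItem` and `PyObject_Size` are the generic C API entry points that dispatch to these slots; a `dict` is used here only because it is a convenient C-implemented mapping.

```python
import ctypes

api = ctypes.pythonapi  # CPython only

# int PyObject_SetItem(PyObject *o, PyObject *key, PyObject *v)
set_item = api.PyObject_SetItem
set_item.argtypes = [ctypes.py_object, ctypes.py_object, ctypes.py_object]
set_item.restype = ctypes.c_int

# Py_ssize_t PyObject_Size(PyObject *o)
size = api.PyObject_Size
size.argtypes = [ctypes.py_object]
size.restype = ctypes.c_ssize_t

data = {}
status = set_item(data, "key", 3)  # plain C int: 0 on success, -1 on error
length = size(data)                # C Py_ssize_t, converted to a Python int by ctypes

print(status, length, data)  # 0 1 {'key': 3}
```

On failure, `ctypes.pythonapi` re-raises the pending Python exception rather than handing back -1, so only the success value is shown.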

In summary, the difference is really to do with shortcuts in finding and calling the function rather than whether the contents of the function are Cythonized.


answered Oct 04 '22 18:10

DavidW
