
namedtuple slow compared to tuple/dictionary/class

Tags:

python-3.x

I am curious why namedtuple is slower than a regular class in Python. Consider the following:

In [1]: from collections import namedtuple

In [2]: Stock = namedtuple('Stock', 'name price shares')  

In [3]: s = Stock('AAPL', 750.34, 90)

In [4]: %%timeit 
   ...: value = s.price * s.shares
   ...:          
175 ns ± 1.17 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [5]: class Stock2:
   ...:     __slots__ = ('name', 'price', 'shares')
   ...:     def __init__(self, name, price, shares):
   ...:         self.name = name       
   ...:         self.price = price
   ...:         self.shares = shares

In [6]: s2 = Stock2('AAPL', 750.34, 90)

In [8]: %%timeit
   ...: value = s2.price * s2.shares
   ...:                                
106 ns ± 0.832 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [9]: class Stock3:                  
   ...:     def __init__(self, name, price, shares):
   ...:         self.name = name 
   ...:         self.price = price     
   ...:         self.shares = shares

In [10]: s3 = Stock3('AAPL', 750.34, 90)

In [11]: %%timeit                      
    ...: value = s3.price * s3.shares
    ...:         
118 ns ± 3.54 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [12]: t = ('AAPL', 750.34, 90)

In [13]: %%timeit         
    ...: values = t[1] * t[2]          
    ...:
93.8 ns ± 1.13 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [14]: d = dict(name='AAPL', price=750.34, shares=90)                                

In [15]: %%timeit
    ...: value = d['price'] * d['shares']
    ...:
92.5 ns ± 0.37 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

I expected namedtuple attribute access to be faster than access on a class without __slots__. This is on Python 3.6. It is also surprising that dictionary lookup performs comparably to tuple indexing.

asked Mar 22 '18 01:03 by skgbanga



1 Answer

For an instance of a Python class, getting an attribute through dot notation looks the name up in the __dict__ of the class (following the MRO) and in the __dict__ of the instance (__dict__ is itself an attribute, and it is a dictionary). Call the value that is found v; what happens next depends on v:

case one: v is not a descriptor, so the dot notation just returns v.
case two: v is a descriptor stored on the class, so the result is fetched from the descriptor's __get__ method.

For the simple class in your question (Stock3), it is case one: the value comes straight from the instance's __dict__.
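The two cases can be seen in a minimal sketch. The names Plain, WithProperty and doubled are made up for illustration and do not come from the question.

```python
class Plain:
    def __init__(self, price):
        self.price = price              # stored in the instance __dict__


class WithProperty:
    def __init__(self, price):
        self._price = price

    @property
    def doubled(self):                  # a property is a descriptor kept in the class __dict__
        return self._price * 2


p = Plain(10)
w = WithProperty(10)

# Case one: the value sits in the instance __dict__ and is returned as is.
case_one = p.__dict__["price"]

# Case two: the name resolves to a descriptor on the class, so __get__ runs.
descriptor = type(w).__dict__["doubled"]
case_two = descriptor.__get__(w, type(w))

print(case_one == p.price, case_two == w.doubled)
```

Calling descriptor.__get__ by hand is what the dot notation does for you in case two, which is the extra work a plain instance attribute avoids.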

For the namedtuple case, take a look at the namedtuple source code: the namedtuple function builds a class from a source-code template, in which each named field is stored in the class's __dict__ as a property.
A property is a descriptor, and an itemgetter instance is what gets called inside the descriptor's __get__ method. Here is the property class written in Python, as it appears in a comment in the CPython C source code:

class property(object):

    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        if doc is None and fget is not None and hasattr(fget, "__doc__"):
            doc = fget.__doc__
        self.__get = fget
        self.__set = fset
        self.__del = fdel
        self.__doc__ = doc

    def __get__(self, inst, type=None):
        if inst is None:
            return self
        if self.__get is None:
            raise AttributeError("unreadable attribute")
        return self.__get(inst)

    def __set__(self, inst, value):
        if self.__set is None:
            raise AttributeError("can't set attribute")
        return self.__set(inst, value)

    def __delete__(self, inst):
        if self.__del is None:
            raise AttributeError("can't delete attribute")
        return self.__del(inst)

To summarize, this is why namedtuple attribute access is slower: getting a value from an instance of the class made by namedtuple takes extra steps (the property's __get__, then the itemgetter call) compared with a simple class.
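If you want to reproduce the comparison outside IPython, a rough sketch with the standard timeit module could look like the following. The candidates dictionary and its labels are my own naming, and the numbers depend heavily on the interpreter version and the machine, so no expected output is shown.

```python
import timeit
from collections import namedtuple

Stock = namedtuple("Stock", "name price shares")


class Stock3:
    def __init__(self, name, price, shares):
        self.name = name
        self.price = price
        self.shares = shares


s = Stock("AAPL", 750.34, 90)
s3 = Stock3("AAPL", 750.34, 90)
t = ("AAPL", 750.34, 90)

# Each candidate is wrapped in a lambda, which adds the same call overhead to all of them.
candidates = {
    "namedtuple": lambda: s.price * s.shares,
    "plain class": lambda: s3.price * s3.shares,
    "tuple index": lambda: t[1] * t[2],
}

NUMBER = 100_000
timings = {}
for label, func in candidates.items():
    # Keep the best of 5 repeats to reduce noise from other processes.
    timings[label] = min(timeit.repeat(func, number=NUMBER, repeat=5))
    print(f"{label:12s} {timings[label] / NUMBER * 1e9:8.1f} ns per call")
```

Because of the lambda overhead the absolute values will be higher than the %%timeit figures in the question; only the relative ordering is of interest.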

If you want to dig deeper and see how namedtuple stores and retrieves values, you can read the Python 3.6 source code of collections.namedtuple.

A hint:
the class that namedtuple creates is a subclass of tuple; it stores the field values in the tuple itself and maps each field name to its index through a property.
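As an illustration of that hint, here is a hand-written sketch of roughly what the generated class looks like. It is an approximation, not the actual template output: the real class also defines helpers such as _fields, _replace and __repr__, which are left out here.

```python
from operator import itemgetter


class Stock(tuple):
    """Simplified stand-in for namedtuple('Stock', 'name price shares')."""

    __slots__ = ()                      # no per-instance __dict__; values live in the tuple

    def __new__(cls, name, price, shares):
        return tuple.__new__(cls, (name, price, shares))

    # Each field name is tied to its index through a property wrapping itemgetter.
    name = property(itemgetter(0))
    price = property(itemgetter(1))
    shares = property(itemgetter(2))


s = Stock("AAPL", 750.34, 90)
value = s.price * s.shares              # two property lookups, each calling an itemgetter
same = s[1] * s[2]                      # plain tuple indexing skips the descriptor
print(value == same)
```

Both expressions read the same tuple slots; the attribute form simply goes through the property and itemgetter first.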

answered Nov 13 '22 21:11 by patpat