We were running some experiments to compare attribute access times for classes and named tuples, and observed something strange.
import time
from collections import namedtuple

class myclass(object):
    def __init__(self, _name, _dob, _value):
        self.name = _name
        self.dob = _dob
        self.value = _value

randomperson1 = myclass('randomperson', 10102000, 10.45)
person = namedtuple('person', 'name dob value')
randomperson2 = person('randomperson', 10102000, 10.45)
Using IPython's %timeit, we observed the following:
%timeit randomperson1.name,randomperson1.value,randomperson1.dob
10000000 loops, best of 3: 125 ns per loop
%timeit randomperson2.name,randomperson2.value,randomperson2.dob
1000000 loops, best of 3: 320 ns per loop
%timeit randomperson2[0],randomperson2[1],randomperson2[2]
10000000 loops, best of 3: 127 ns per loop
Why's accessing a namedtuple by field name so much slower than accessing a class's member variable? Is there any way to speed this up?
That's because in a namedtuple the attributes name, value, and dob are not simple attributes stored on the instance. They are actually turned into something more complicated. From collections.py:
_field_template = '''\
{name} = _property(_itemgetter({index:d}), doc='Alias for field number {index:d}')
'''
For example, the dob field becomes:
dob = property(itemgetter(2), doc='Alias for field number 2')
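So, as you can see, there are additional layers on top of it: accessing a field by name first finds the property on the class, which then calls itemgetter, which finally indexes the tuple. The people who created namedtuple decided that they wanted consistency and memory efficiency at the cost of CPU efficiency. That is the reason for the slowdown.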
This can be easily observed when you create your own custom class emulating this:
from operator import itemgetter
class CustomTuple(tuple):
    my_attr = property(itemgetter(0))
test_tuple = CustomTuple([1])
Now measure access to test_tuple.my_attr. You should get pretty much the same results.
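If you are not in IPython, a rough sketch of that measurement with the standard timeit module could look like this. PlainClass, best_ns, and the loop counts are names and values I made up for the comparison; the absolute numbers will vary by machine and Python version, so treat them as relative only.

```python
import timeit
from operator import itemgetter


class CustomTuple(tuple):
    # Same emulation as above: a property that indexes the tuple.
    my_attr = property(itemgetter(0))


class PlainClass(object):
    # Hypothetical ordinary class, storing the value on the instance.
    def __init__(self, my_attr):
        self.my_attr = my_attr


test_tuple = CustomTuple([1])
plain = PlainClass(1)

# Names made available to the timed statements.
namespace = {'test_tuple': test_tuple, 'plain': plain}


def best_ns(stmt, number=200000, repeat=3):
    """Return the best-of-`repeat` time per execution of `stmt`, in nanoseconds."""
    timings = timeit.repeat(stmt, globals=namespace, number=number, repeat=repeat)
    return min(timings) / number * 1e9


results = {
    'plain.my_attr': best_ns('plain.my_attr'),
    'test_tuple.my_attr': best_ns('test_tuple.my_attr'),
    'test_tuple[0]': best_ns('test_tuple[0]'),
}

for label, ns in results.items():
    print('%-20s %8.1f ns per access' % (label, ns))
```

Taking the minimum over several repeats mirrors the "best of 3" that %timeit reports in the question.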