Python 3 - Which one is faster for accessing data: dataclasses or dictionaries?

1 Answer

By default, instances of Python classes store their attributes in a dictionary (__dict__) under the hood, as you can read in the documentation. For a more in-depth reference on how Python classes (and many more things) work, you can also check out the data model chapter of the Python documentation, in particular the section on custom classes.
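To make the point about dictionary-backed attributes concrete, here is a minimal sketch. The class name Point and its fields are hypothetical and only serve as illustration:

```python
from dataclasses import dataclass


@dataclass
class Point:  # hypothetical example class, not from the question
    x: int
    y: int


p = Point(1, 2)

# vars(p) returns the instance's __dict__, which is a plain dict
attrs = vars(p)
print(type(attrs))  # <class 'dict'>

# Writing to the dict is visible through normal attribute access,
# because both go to the same underlying storage
p.__dict__['x'] = 10
print(p.x)  # 10
```

So an attribute lookup on a regular instance ends up as a dictionary lookup plus some extra steps, which is why the timings for the two are expected to be close.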

So in general, there shouldn't be a loss in performance by moving from dictionaries to dataclasses. But it's better to make sure with the timeit module:

Baseline

# dictionary creation
$ python -m timeit "{'var': 1}"
5000000 loops, best of 5: 52.9 nsec per loop

# dictionary key access
$ python -m timeit -s "d = {'var': 1}" "d['var']"
10000000 loops, best of 5: 20.3 nsec per loop

Basic dataclass

# dataclass creation
$ python -m timeit -s "from dataclasses import dataclass" -s "@dataclass" -s "class A: var: int" "A(1)" 
1000000 loops, best of 5: 288 nsec per loop

# dataclass attribute access
$ python -m timeit -s "from dataclasses import dataclass" -s "@dataclass" -s "class A: var: int" -s "a = A(1)" "a.var" 
10000000 loops, best of 5: 25.3 nsec per loop

Here we can see that using classes does have some overhead. For instance creation it's quite a bit (~5 times slower, 288 nsec vs. 52.9 nsec), but you don't necessarily need to care that much about it as long as you don't plan to create and throw away dataclass instances at a very high rate.

The attribute access is probably the more important metric, and while dataclasses are again slower (~1.25 times), this time it's not by that much.
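If you would rather run the comparison from a script than from the command line, a rough sketch with timeit.timeit could look like this. The loop count n is an arbitrary choice, and because the objects are passed in through globals the absolute numbers will not match the command-line runs above; only the ratio is of interest:

```python
import timeit
from dataclasses import dataclass


@dataclass
class A:
    var: int


d = {'var': 1}
a = A(1)

n = 200_000  # arbitrary loop count, an assumption for this sketch

# Time the key lookup and the attribute lookup with the same loop count
dict_time = timeit.timeit("d['var']", globals={'d': d}, number=n)
attr_time = timeit.timeit("a.var", globals={'a': a}, number=n)

print(f"dict key access:  {dict_time / n * 1e9:.1f} nsec per loop")
print(f"attribute access: {attr_time / n * 1e9:.1f} nsec per loop")
print(f"ratio: {attr_time / dict_time:.2f}")
```

The results depend on the machine and the Python version, so treat any single run as an indication rather than a fixed number.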

If you think that's still a tad too slow, you can tune your dataclass (or any class, really) by using __slots__ instead of a dictionary to store its attributes:

Slotted dataclass

# dataclass creation
$ python -m timeit -s "from dataclasses import dataclass" -s "@dataclass" -s "class A: __slots__ = ('var',); var: int" "A(1)" 
1000000 loops, best of 5: 242 nsec per loop

# dataclass attribute access
$ python -m timeit -s "from dataclasses import dataclass" -s "@dataclass" -s "class A: __slots__ = ('var',); var: int" -s "a = A(1)" "a.var"
10000000 loops, best of 5: 21.7 nsec per loop

By using this pattern we could shave off a few more nanoseconds. At this point, at least regarding attribute access, there shouldn't be a noticeable difference to dictionaries any more, and you can use the upsides of dataclasses without compromising speed.
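Written out as a regular script, the slotted variant from the timings looks like the sketch below. The class name Slotted is hypothetical. Two things worth knowing: a slotted instance has no __dict__, so you can't attach new attributes to it, and with a hand-written __slots__ a field can't have a default value, because the default would clash with the slot of the same name. On Python 3.10 and newer, @dataclass(slots=True) generates the slots for you and avoids that clash.

```python
from dataclasses import dataclass


@dataclass
class Slotted:  # hypothetical name; same shape as class A in the timings
    __slots__ = ('var',)
    var: int


s = Slotted(1)
print(s.var)  # 1

# No per-instance dictionary is created for slotted classes
print(hasattr(s, '__dict__'))  # False

# Attributes that are not listed in __slots__ cannot be added
try:
    s.other = 2
except AttributeError as exc:
    print("cannot add attribute:", exc)
```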

181

answered Oct 09 '22 06:10

Arne


Tags:

python

dictionary

python-3.7

python-dataclasses

sergiomafra
