PEP-557 introduced data classes into Python standard library, that basically can fill the same role as collections.namedtuple
and typing.NamedTuple
. And now I'm wondering how to separate the use cases in which namedtuple is still a better solution.
Of course, all the credit goes to dataclass
if we need:
property
decorators, manageable attributesData classes advantages are briefly explained in the same PEP: Why not just use namedtuple.
But how about an opposite question for namedtuples: why not just use dataclass? I guess probably namedtuple is better from the performance standpoint but found no confirmation on that yet.
Let's consider the following situation:
We are going to store pages dimensions in a small container with statically defined fields, type hinting and named access. No further hashing, comparing and so on are needed.
NamedTuple approach:
from typing import NamedTuple
PageDimensions = NamedTuple("PageDimensions", [('width', int), ('height', int)])
DataClass approach:
from dataclasses import dataclass
@dataclass
class PageDimensions:
width: int
height: int
Which solution is preferable and why?
P.S. the question isn't a duplicate of that one in any way, because here I'm asking about the cases in which namedtuple is better, not about the difference (I've checked docs and sources before asking)
DataClass is slower than others while creating data objects (2.94 µs). NamedTuple is the faster one while creating data objects (2.01 µs). An object is slower than DataClass but faster than NamedTuple while creating data objects (2.34 µs).
In general, you can use namedtuple instances wherever you need a tuple-like object. Named tuples have the advantage that they provide a way to access their values using field names and the dot notation. This will make your code more Pythonic.
Namedtuple makes your tuples self-document. You can easily understand what is going on by having a quick glance at your code. And as you are not bound to use integer indexes to access members of a tuple, it makes it more easy to maintain your code.
Benefits of Python Namedtuple Unlike a regular tuple, python namedtuple can also access values using names of the fields. 2. Python namedtuple is just as memory-efficient as a regular tuple, because it does not have per-instance dictionaries. This is also why it is faster than a dictionary.
It depends on your needs. Each of them has own benefits.
Here is a good explanation of Dataclasses on PyCon 2018 Raymond Hettinger - Dataclasses: The code generator to end all code generators
In Dataclass
all implementation is written in Python, whereas in NamedTuple
, all of these behaviors come for free because NamedTuple
inherits from tuple
. And because the tuple
structure is written in C, standard methods are faster in NamedTuple
(hash, comparing and etc).
Note also that Dataclass
is based on dict
whereas NamedTuple
is based on tuple
. Thus, you have advantages and disadvantages of using these structures. For example, space usage is less with a NamedTuple
, but time access is faster with a Dataclass
.
Please, see my experiment:
In [33]: a = PageDimensionsDC(width=10, height=10)
In [34]: sys.getsizeof(a) + sys.getsizeof(vars(a))
Out[34]: 168
In [35]: %timeit a.width
43.2 ns ± 1.05 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [36]: a = PageDimensionsNT(width=10, height=10)
In [37]: sys.getsizeof(a)
Out[37]: 64
In [38]: %timeit a.width
63.6 ns ± 1.33 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
But with increasing the number of attributes of NamedTuple
access time remains the same small, because for each attribute it creates a property with the name of the attribute. For example, for our case the part of the namespace of the new class will look like:
from operator import itemgetter
class_namespace = {
...
'width': property(itemgetter(0, doc="Alias for field number 0")),
'height': property(itemgetter(0, doc="Alias for field number 1"))**
}
In which cases namedtuple is still a better choice?
When your data structure needs to/can be immutable, hashable, iterable, unpackable, comparable then you can use NamedTuple
. If you need something more complicated, for example, a possibility of inheritance for your data structure then use Dataclass
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With