
Interesting performance of creating objects via normal class, data class and named tuple

I was reading about data classes and named tuples, and I noticed that creating objects through these different Python features performs differently.

dataclass:

In [1]: from dataclasses import dataclass
   ...:
   ...: @dataclass
   ...: class Position:
   ...:     lon: float = 0.0
   ...:     lat: float = 0.0
   ...:

In [2]: %timeit for _ in range(1000): Position(12.5, 345)
326 µs ± 34.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Normal class:

In [1]: class Position:
   ...:
   ...:     def __init__(self, lon=0.0, lat=0.0):
   ...:         self.lon = lon
   ...:         self.lat = lat
   ...:

In [2]: %timeit for _ in range(1000): Position(12.5, 345)
248 µs ± 2.89 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

namedtuple:

In [1]: from collections import namedtuple

In [2]: Position = namedtuple("Position", ["lon", "lat"], defaults=[0.0, 0.0])

In [3]: %timeit for _ in range(1000): Position(12.5, 345)
286 µs ± 13.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

  • Python version: 3.7.3
  • OS: macOS Mojave

All three implementations have the same attributes and the same default values.

  1. Why does this trend hold: time(dataclass) > time(namedtuple) > time(normal class)?
  2. What does each implementation do that accounts for its time?
  3. Which implementation performs best in which scenario?

Here, time denotes the time taken to create objects.

bigbounty asked Jul 15 '20 10:07


1 Answer

In Python, almost everything is backed by a dict. In the case of a data class there are more entries in that dict, so in turn it takes more time to put them there.
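To check where those extra entries live, here is a minimal sketch. The class names DataPosition and PlainPosition are made up for this illustration and only mirror the classes from the question. It compares the per-instance dict with the class dict; the exact set of extra class entries varies between Python versions, so treat the printed list as indicative only.

```python
from dataclasses import dataclass


@dataclass
class DataPosition:  # hypothetical name, mirrors the question's dataclass
    lon: float = 0.0
    lat: float = 0.0


class PlainPosition:  # hypothetical name, mirrors the question's normal class
    def __init__(self, lon=0.0, lat=0.0):
        self.lon = lon
        self.lat = lat


d = DataPosition(12.5, 345)
p = PlainPosition(12.5, 345)

# Per-instance dicts: these are filled on every construction.
instance_dict_data = vars(d)
instance_dict_plain = vars(p)

# Class dicts: these are filled once, when the class statement runs.
extra_class_entries = set(vars(DataPosition)) - set(vars(PlainPosition))

print(instance_dict_data)
print(instance_dict_plain)
print(sorted(extra_class_entries))
```

In this sketch both instance dicts hold just lon and lat, while the additional names show up in the class dict.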

How does that difference arise? @Arne's comment pointed out that I was missing something here, so I wrote some sample code:

from dataclasses import dataclass
import time

@dataclass
class Position:
    lon: float = 0.0
    lat: float = 0.0


start_time = time.time()
for i in range(100000):
    p = Position(lon=1.0, lat=1.0)
elapsed = time.time() - start_time
print(f"dataclass {elapsed}")
print(dir(p))


class Position2:
    lon: float = 0.0
    lat: float = 0.0

    def __init__(self, lon, lat):
        self.lon = lon
        self.lat = lat


start_time = time.time()
for i in range(100000):
    p = Position2(lon=1.0, lat=1.0)
elapsed = time.time() - start_time
print(f"just class {elapsed}")
print(dir(p))

start_time = time.time()
for i in range(100000):
    p = {"lon": 1.0, "lat": 1.0}
elapsed = time.time() - start_time
print(f"dict {elapsed}")

With results:

/usr/bin/python3.8 ...../test.py
dataclass 0.16358232498168945
['__annotations__', '__class__', '__dataclass_fields__', '__dataclass_params__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'lat', 'lon']
just class 0.1495649814605713
['__annotations__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'lat', 'lon']
dict 0.028212785720825195

Process finished with exit code 0

The dict example is for reference.
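A single pass timed with time.time() can be noisy, so here is a hedged sketch of the same comparison using the standard timeit module, with the namedtuple from the question added. The names DataPosition, PlainPosition, TuplePosition and candidates are assumptions made for this illustration. Each candidate is wrapped in a lambda, which adds the same call overhead to every variant, so only the relative ordering is meaningful.

```python
import timeit
from collections import namedtuple
from dataclasses import dataclass


@dataclass
class DataPosition:  # hypothetical name for the dataclass variant
    lon: float = 0.0
    lat: float = 0.0


class PlainPosition:  # hypothetical name for the handwritten variant
    def __init__(self, lon=0.0, lat=0.0):
        self.lon = lon
        self.lat = lat


TuplePosition = namedtuple("TuplePosition", ["lon", "lat"], defaults=[0.0, 0.0])

candidates = {
    "dataclass": lambda: DataPosition(lon=1.0, lat=1.0),
    "just class": lambda: PlainPosition(lon=1.0, lat=1.0),
    "namedtuple": lambda: TuplePosition(lon=1.0, lat=1.0),
    "dict": lambda: {"lon": 1.0, "lat": 1.0},
}

results = {}
for name, make in candidates.items():
    # Five repeats of 100000 constructions each; keep the fastest repeat,
    # which is usually the one least disturbed by other processes.
    results[name] = min(timeit.repeat(make, number=100_000, repeat=5))

for name, seconds in results.items():
    print(f"{name:<12} {seconds:.4f} s per 100000 objects")
```

The numbers depend on the machine and the Python version, so no expected output is given here.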

I looked into the dataclasses module. This function:

(489) def _init_fn(fields, frozen, has_post_init, self_name, globals):

is responsible for creating the constructor. As Arne spotted, the post_init code is optional and is not generated here. My other idea was that there is some extra work done around the fields, but:

In [5]: p = Position(lat=1.1, lon=2.2)

In [7]: p.lat.__class__
Out[7]: float

so there are no additional wrappers or extra code here. From all of that, the only additional thing I saw is that there are more methods.
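One way to look at the generated constructor without reading the dataclasses source is to inspect it at runtime. This is a minimal sketch using only the standard inspect module and the function's code object; the variable names are made up for this illustration, and the exact contents of co_names are an implementation detail that may differ between Python versions.

```python
import inspect
from dataclasses import dataclass


@dataclass
class Position:
    lon: float = 0.0
    lat: float = 0.0


generated_init = Position.__init__

# Signature of the constructor that the decorator generated.
signature = inspect.signature(generated_init)
print(signature)

# Names referenced by the generated body (attribute stores and globals).
attribute_names = generated_init.__code__.co_names
print(attribute_names)

# No __post_init__ is defined on this class, so there is no hook to call
# after the attribute assignments.
has_post_init = hasattr(Position, "__post_init__")
print(has_post_init)
```

If co_names lists only lon and lat, the generated body performs the same two attribute assignments as the handwritten constructor.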

Michał Zaborowski answered Oct 18 '22 04:10