I am using Python 3.6 and the dataclasses backport package from ericvsmith. It seems that calling dataclasses.asdict(my_dataclass) is ~10x slower than calling my_dataclass.__dict__:
In [172]: @dataclass
     ...: class MyDataClass:
     ...:     a: int
     ...:     b: int
     ...:     c: str
     ...:

In [173]: %%time
     ...: _ = [MyDataClass(1, 2, "A" * 1000).__dict__ for _ in range(1_000_000)]
     ...:
CPU times: user 631 ms, sys: 249 ms, total: 880 ms
Wall time: 880 ms

In [175]: %%time
     ...: _ = [dataclasses.asdict(MyDataClass(1, 2, "A" * 1000)) for _ in range(1_000_000)]
     ...:
CPU times: user 11.3 s, sys: 328 ms, total: 11.6 s
Wall time: 11.7 s
Is this expected behavior? In what cases should I use dataclasses.asdict(obj) instead of obj.__dict__?
Edit: Using __dict__.copy() does not make a big difference:

In [176]: %%time
     ...: _ = [MyDataClass(1, 2, "A" * 1000).__dict__.copy() for _ in range(1_000_000)]
     ...:
CPU times: user 922 ms, sys: 48 ms, total: 970 ms
Wall time: 970 ms
In most cases where you would have used __dict__ without dataclasses, you should probably keep using __dict__, maybe with a copy call. asdict does a lot of extra work that you may not actually want. Here's what it does.
First, from the docs:
Each dataclass is converted to a dict of its fields, as name: value pairs. dataclasses, dicts, lists, and tuples are recursed into. For example:
@dataclass
class Point:
    x: int
    y: int

@dataclass
class C:
    mylist: List[Point]

p = Point(10, 20)
assert asdict(p) == {'x': 10, 'y': 20}
c = C([Point(0, 0), Point(10, 4)])
assert asdict(c) == {'mylist': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]}
So if you want recursive dataclass dictification, use asdict. If you don't want it, then all the overhead that goes into providing it is wasted. In particular, if you use asdict, then changing the implementation of contained objects to use dataclass will change the result of asdict on outer objects.
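To see the difference concretely, here is a minimal sketch (the Inner/Outer names are made up for illustration): __dict__ leaves a nested dataclass instance as an object, while asdict recursively turns it into a dict.

>>> from dataclasses import dataclass, asdict
>>> @dataclass
... class Inner:
...     n: int
...
>>> @dataclass
... class Outer:
...     inner: Inner
...
>>> o = Outer(Inner(1))
>>> o.__dict__                 # nested object left alone
{'inner': Inner(n=1)}
>>> asdict(o)                  # nested dataclass recursively dictified
{'inner': {'n': 1}}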
Aside from that, asdict builds a new dict, while __dict__ simply accesses the object's attribute dict directly. The return value of asdict will not be affected by reassignment of the original object's fields. Also, asdict uses fields, so if you add attributes to a dataclass instance that don't correspond to declared fields, asdict won't include them.
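Both points are easy to demonstrate with a small sketch (the extra attribute is purely illustrative):

>>> from dataclasses import dataclass, asdict
>>> @dataclass
... class Point:
...     x: int
...     y: int
...
>>> p = Point(1, 2)
>>> p.extra = 'not a field'    # attribute with no declared field
>>> p.__dict__                 # __dict__ shows everything on the instance
{'x': 1, 'y': 2, 'extra': 'not a field'}
>>> d = asdict(p)              # asdict only looks at declared fields
>>> d
{'x': 1, 'y': 2}
>>> p.x = 99                   # later reassignment doesn't touch the new dict
>>> d
{'x': 1, 'y': 2}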
Finally, the docs don't mention it at all, but asdict will call deepcopy on everything that isn't a dataclass object, dict, list, or tuple:

else:
    return copy.deepcopy(obj)
(Dataclass objects, dicts, lists, and tuples go through the recursive logic, which also builds a copy, just with recursive dictification applied.)
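The practical upshot is shown in the small sketch below (the Holder class is made up for illustration): attributes reached through __dict__ are the very same objects stored on the instance, while asdict hands back copies, either via deepcopy or via the recursive copy logic.

>>> from dataclasses import dataclass, asdict
>>> @dataclass
... class Holder:
...     tags: set      # not a dataclass/dict/list/tuple, so it hits the deepcopy branch
...     items: list    # copied by the recursive logic instead
...
>>> h = Holder({'a', 'b'}, [1, 2])
>>> h.__dict__['tags'] is h.tags    # __dict__ exposes the same objects
True
>>> d = asdict(h)
>>> d['tags'] is h.tags             # deep-copied
False
>>> d['items'] is h.items           # rebuilt by the recursive logic
False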
deepcopy is really expensive on its own, and the lack of any memo handling means that asdict is likely to create multiple copies of shared objects in nontrivial object graphs. Watch out for that:
>>> from dataclasses import dataclass, asdict
>>> @dataclass
... class Foo:
...     x: object
...     y: object
...
>>> a = object()
>>> b = Foo(a, a)
>>> c = asdict(b)
>>> b.x is b.y
True
>>> c['x'] is c['y']
False
>>> c['x'] is b.x
False