How do I pickle an instance of a frozen dataclass with __slots__? For example, the following code raises an exception in Python 3.7.0:
    import pickle
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class A:
        __slots__ = ('a',)
        a: int

    b = pickle.dumps(A(5))
    pickle.loads(b)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<string>", line 3, in __setattr__
    dataclasses.FrozenInstanceError: cannot assign to field 'a'
This works if I remove either the frozen or the __slots__. Is this just a bug?
The problem comes from pickle using the __setattr__ method of the instance when setting the state of the slots.

The default __setstate__ is defined in load_build in _pickle.c line 6220.

For the items in the state dict, the instance __dict__ is updated directly:

    if (PyObject_SetItem(dict, d_key, d_value) < 0)

whereas for the items in the slotstate dict, the instance's __setattr__ is used:

    if (PyObject_SetAttr(inst, d_key, d_value) < 0)

Now because the instance is frozen, __setattr__ raises FrozenInstanceError when loading.
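The mechanism described above can be reproduced in pure Python without going through pickle itself. The following is a minimal sketch, not the actual C implementation: it reads the state that the default reduce protocol records, creates a bare instance the way unpickling would, and then tries to restore the slot both ways. The variable names (state, slotstate, inst, failed) are only illustrative.

```python
import dataclasses
from dataclasses import dataclass


@dataclass(frozen=True)
class A:
    __slots__ = ('a',)
    a: int


# State recorded by the default reduce protocol (protocol 2 and above):
# a pair of (instance __dict__, dict of slot values).
state = A(5).__reduce_ex__(2)[2]
_, slotstate = state

# Roughly what unpickling does: build a bare instance without calling __init__.
inst = A.__new__(A)

# Restoring through the instance's own __setattr__ fails on a frozen dataclass.
try:
    for name, value in slotstate.items():
        setattr(inst, name, value)
    failed = False
except dataclasses.FrozenInstanceError:
    failed = True

# Bypassing the frozen __setattr__ with object.__setattr__ succeeds.
for name, value in slotstate.items():
    object.__setattr__(inst, name, value)
```

After this runs, failed is True and inst compares equal to A(5), which is the behaviour a custom __setstate__ needs to rely on.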
To circumvent this, you can define your own __setstate__ method which will use object.__setattr__, and not the instance's __setattr__.
The docs give some sort of warning for this:

    "There is a tiny performance penalty when using frozen=True: __init__() cannot use simple assignment to initialize fields, and must use object.__setattr__()."
It may also be good to define __getstate__, as the instance __dict__ is always None in your case. If you don't, the state argument of __setstate__ will be a tuple (None, {'a': 5}), the first value being the value of the instance's __dict__ and the second the slotstate dict.
    import pickle
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class A:
        __slots__ = ('a',)
        a: int

        def __getstate__(self):
            return dict(
                (slot, getattr(self, slot))
                for slot in self.__slots__
                if hasattr(self, slot)
            )

        def __setstate__(self, state):
            for slot, value in state.items():
                object.__setattr__(self, slot, value)  # <- use object.__setattr__

    b = pickle.dumps(A(5))
    pickle.loads(b)
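If several frozen, slotted dataclasses need this, the two methods could be factored into a small base class. The sketch below is one possible way to do that, not an established recipe: PickleFrozenSlots and Point are hypothetical names introduced here, and the mixin walks the MRO so that slots declared on parent classes are included as well (self.__slots__ alone only sees the most derived class).

```python
import pickle
from dataclasses import dataclass


class PickleFrozenSlots:
    """Hypothetical mixin; not part of the standard library."""

    __slots__ = ()  # empty, so subclasses still get no per-instance __dict__

    def __getstate__(self):
        # Gather slot names declared on every class in the MRO.
        names = []
        for klass in type(self).__mro__:
            slots = klass.__dict__.get('__slots__', ())
            if isinstance(slots, str):
                slots = (slots,)
            names.extend(slots)
        # Skip slots that were never assigned.
        return {name: getattr(self, name) for name in names if hasattr(self, name)}

    def __setstate__(self, state):
        # Bypass the frozen __setattr__ generated by @dataclass(frozen=True).
        for name, value in state.items():
            object.__setattr__(self, name, value)


@dataclass(frozen=True)
class Point(PickleFrozenSlots):
    __slots__ = ('x', 'y')
    x: int
    y: int


restored = pickle.loads(pickle.dumps(Point(1, 2)))
```

Here restored should compare equal to Point(1, 2). As with any pickled class, Point has to be importable from the module where it is defined for the round trip to work.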
I personally would not call it a bug, as the pickling process is designed to be flexible, but there is room for a feature enhancement. A revision of the pickling protocol could fix this in the future. Unless I am missing something, and aside from the tiny performance penalty, using PyObject_GenericSetAttr for all the slots might be a reasonable fix?