It was decided to remove direct support for __slots__ from dataclasses for Python 3.7. Despite this, __slots__ can still be used with dataclasses:

    from dataclasses import dataclass

    @dataclass
    class C:
        __slots__ = "x"
        x: int
However, because of the way __slots__ works, it isn't possible to assign a default value to a dataclass field:

    from dataclasses import dataclass

    @dataclass
    class C:
        __slots__ = "x"
        x: int = 1
This results in an error:
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: 'x' in __slots__ conflicts with class variable
How can __slots__ and default dataclass fields be made to work together?
The __slots__ declaration allows us to explicitly declare data members, causes Python to reserve space for them in memory, and prevents the creation of __dict__ and __weakref__ attributes. It also prevents the creation of any variables that aren't declared in __slots__.
Slots are a special mechanism in Python for reducing the memory used by objects. By default, every instance stores its attributes in a dynamic dictionary. With __slots__, the attributes are laid out statically, so no per-instance dictionary is needed to hold them.
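As a small illustration of the behaviour described above, the following sketch compares an ordinary class with a slotted one. The class names WithDict and WithSlots are made up for this example and do not come from the question.

```python
class WithDict:
    """Ordinary class: attributes live in a per-instance __dict__."""
    def __init__(self, x):
        self.x = x


class WithSlots:
    """Slotted class: only the names listed in __slots__ are allowed."""
    __slots__ = ("x",)

    def __init__(self, x):
        self.x = x


plain = WithDict(1)
slotted = WithSlots(1)

# An ordinary instance accepts new attributes at any time.
plain.y = 2

# A slotted instance has no __dict__ to put undeclared attributes in.
has_dict = hasattr(slotted, "__dict__")

try:
    slotted.y = 2
    rejected = False
except AttributeError:
    rejected = True
```

Here has_dict ends up False and rejected ends up True, which is the trade-off being discussed: less memory per instance in exchange for a fixed set of attribute names.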
2021 UPDATE: direct support for __slots__ was added in Python 3.10. I am leaving this answer for posterity and won't be updating it.
The problem is not unique to dataclasses. ANY conflicting class attribute will stomp all over a slot:
    >>> class Failure:
    ...     __slots__ = tuple("xyz")
    ...     x=1
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: 'x' in __slots__ conflicts with class variable
This is simply how slots work. The error happens because __slots__ creates a class-level descriptor object for each slot name:

    >>> class Success:
    ...     __slots__ = tuple("xyz")
    ...
    >>>
    >>> type(Success.x)
    <class 'member_descriptor'>
In order to prevent this conflicting variable name error, the class namespace must be altered before the class object is instantiated such that there are not two objects competing for the same member name in the class: the specified default value* and the slot descriptor created by the slots machinery.
For this reason, an __init_subclass__ method on a parent class will not be sufficient, nor will a class decorator, because in both cases the class object has already been created by the time these functions have received the class to alter it.
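A quick way to see why a class decorator comes too late is the sketch below. The record decorator is a made-up helper for this illustration; it only notes which classes reach it.

```python
calls = []


def record(cls):
    """Hypothetical decorator that records every class it receives."""
    calls.append(cls.__name__)
    return cls


message = None
try:
    @record
    class Failure:
        __slots__ = ("x",)
        x = 1  # conflicts with the slot named "x"
except ValueError as exc:
    # The error is raised while the class object is being created,
    # which happens before any decorator is applied.
    message = str(exc)
```

After this runs, calls is still empty: the ValueError is raised during class creation, so the decorator never receives a class it could fix.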
Until such time as the slots machinery is altered to allow more flexibility, or the language itself provides an opportunity to alter the class namespace before the class object is instantiated, our only choice is to use a metaclass.
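As a rough sketch only, such a metaclass might look like the following. The name SlottedDataclassMeta is hypothetical, and the code deliberately ignores corner cases such as inheritance between slotted dataclasses, name-mangled slots, or a __dict__ slot.

```python
from dataclasses import dataclass


class SlottedDataclassMeta(type):
    """Hypothetical metaclass allowing defaults for slot names in the class body."""

    def __new__(mcls, name, bases, namespace, **kwargs):
        slots = namespace.get("__slots__", ())
        if isinstance(slots, str):
            slots = (slots,)
        namespace = dict(namespace)

        # Pull the conflicting defaults out before the class object exists.
        defaults = {}
        for slot in slots:
            if slot in namespace:
                defaults[slot] = namespace.pop(slot)

        # Creating the class now succeeds and builds the slot descriptors.
        cls = super().__new__(mcls, name, bases, namespace, **kwargs)
        descriptors = {slot: cls.__dict__[slot] for slot in defaults}

        # Temporarily expose the defaults so the dataclass machinery sees them.
        for slot, value in defaults.items():
            setattr(cls, slot, value)
        cls = dataclass(cls)

        # Put the slot descriptors back where they belong.
        for slot, descriptor in descriptors.items():
            setattr(cls, slot, descriptor)
        return cls


class C(metaclass=SlottedDataclassMeta):
    __slots__ = ("x", "y")
    x: int
    y: int = 2


c = C(1)
```

The generated __init__ captures the default when dataclass runs, so restoring the descriptors afterwards does not lose it. This is an illustration of the idea under the stated assumptions, not a hardened implementation.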
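Any metaclass written to solve this problem must, at minimum:

- remove the conflicting class attributes from the namespace before the class object is created
- create the class object, so that the slot descriptors exist
- put the removed members and their values back in the class __dict__ (so the dataclass machinery can find them)
- pass the class object to the dataclass decorator
- restore the slot descriptors to their places
- take corner cases into account (such as what to do if there is a __dict__ slot)

To say the least, this is an extremely complicated endeavor. It would be easier to define the class as follows, without a default value so that the conflict doesn't occur at all, and then add a default value afterward.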
The unaltered dataclass would look like this:
    @dataclass
    class C:
        __slots__ = "x"
        x: int
The alteration is straightforward. Change the __init__ signature to reflect the desired default value, and then change the __dataclass_fields__ to reflect the presence of a default value.

    from functools import wraps

    def change_init_signature(init):
        @wraps(init)
        def __init__(self, x=1):
            init(self, x)
        return __init__

    C.__init__ = change_init_signature(C.__init__)
    C.__dataclass_fields__["x"].default = 1
Test:
    >>> C()
    C(x=1)
    >>> C(2)
    C(x=2)
    >>> C.x
    <member 'x' of 'C' objects>
    >>> vars(C())
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: vars() argument must have __dict__ attribute
It works!
A setmember decorator

With some effort, a so-called setmember decorator could be employed to automatically alter the class in the manner above. This would require deviating from the dataclasses API in order to define the default value in a location other than inside the class body, perhaps something like:
    @setmember(x=field(default=1))
    @dataclass
    class C:
        __slots__ = "x"
        x: int
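One possible shape for such a decorator is sketched below. setmember is not part of any library; the name and the implementation are assumptions made for this illustration, which generalises the manual alteration shown earlier using inspect.signature.

```python
import functools
import inspect
from dataclasses import dataclass, field, fields


def setmember(**defaults):
    """Hypothetical decorator: attach defaults to fields of a slotted dataclass."""

    def decorate(cls):
        original_init = cls.__init__
        signature = inspect.signature(original_init)

        # Rebuild the signature with the requested default values.
        parameters = []
        for parameter in signature.parameters.values():
            if parameter.name in defaults:
                parameter = parameter.replace(default=defaults[parameter.name].default)
            parameters.append(parameter)
        new_signature = signature.replace(parameters=parameters)

        @functools.wraps(original_init)
        def __init__(self, *args, **kwargs):
            bound = new_signature.bind(self, *args, **kwargs)
            bound.apply_defaults()
            original_init(*bound.args, **bound.kwargs)

        __init__.__signature__ = new_signature
        cls.__init__ = __init__

        # Keep the dataclass metadata consistent with the new defaults.
        for name, spec in defaults.items():
            cls.__dataclass_fields__[name].default = spec.default
        return cls

    return decorate


@setmember(x=field(default=1))
@dataclass
class C:
    __slots__ = ("x",)
    x: int


c = C()
```

The sketch only handles plain defaults; default_factory and other field options would need extra work.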
The same thing could also be accomplished through an __init_subclass__ method on a parent class:

    from dataclasses import field

    class SlottedDataclass:
        def __init_subclass__(cls, **kwargs):
            super().__init_subclass__()
            # make the class changes here, using the defaults in kwargs

    class C(SlottedDataclass, x=field(default=1)):
        __slots__ = "x"
        x: int
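One way the "make the class changes here" step might be filled in is sketched below. This is an assumption about how it could work, not the author's implementation: it temporarily swaps the slot descriptors for the defaults, applies dataclass, and swaps the descriptors back. The parent declares an empty __slots__ so that instances of subclasses do not gain a __dict__.

```python
from dataclasses import dataclass, field


class SlottedDataclass:
    __slots__ = ()  # keep subclasses free of a per-instance __dict__

    def __init_subclass__(cls, **defaults):
        super().__init_subclass__()
        # Remember the slot descriptors for the names that get defaults.
        descriptors = {name: cls.__dict__[name] for name in defaults}
        # Temporarily expose the defaults as class attributes.
        for name, value in defaults.items():
            setattr(cls, name, value)
        dataclass(cls)  # modifies cls in place
        # Restore the slot descriptors.
        for name, descriptor in descriptors.items():
            setattr(cls, name, descriptor)


class C(SlottedDataclass, x=field(default=1)):
    __slots__ = ("x",)
    x: int


c = C()
```

Because the defaults arrive as class keyword arguments rather than in the class body, the class object can be created without a conflict, which is why __init_subclass__ is workable in this variant.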
Another possibility, as mentioned above, would be for the Python language to alter the slots machinery to allow more flexibility. One way of doing this might be to change the slots descriptor itself to store class level data at the time of class definition.

This could be done, perhaps, by supplying a dict as the __slots__ argument (see below). The class-level data (1 for x, 2 for y) could just be stored on the descriptor itself for retrieval later:
    class C:
        __slots__ = {"x": 1, "y": 2}

    assert C.x.value == 1
    assert C.y.value == 2
One difficulty: it may be desired to only have a slot_member.value present on some slots and not others. This could be accommodated by importing a null-slot factory from a new slottools library:

    from slottools import nullslot

    class C:
        __slots__ = {"x": 1, "y": 2, "z": nullslot()}

    assert not hasattr(C.z, "value")
The style of code suggested above would be a deviation from the dataclasses API. However, the slots machinery itself could even be altered to allow for this style of code, with accommodation of the dataclasses API specifically in mind:
    class C:
        __slots__ = "x", "y", "z"
        x = 1  # 1 is stored on C.x.value
        y = 2  # 2 is stored on C.y.value

    assert C.x.value == 1
    assert C.y.value == 2
    assert not hasattr(C.z, "value")
The other possibility is altering/preparing (synonymous with the __prepare__ method of a metaclass) the class namespace.
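To make the idea of preparing the namespace concrete, here is a sketch of what a metaclass can already do today with __prepare__. DivertingNamespace, PrepareMeta and the slot_defaults attribute are names invented for this illustration, and the approach assumes __slots__ is declared before any defaults in the class body.

```python
class DivertingNamespace(dict):
    """Hypothetical class namespace that sets aside values assigned to slot names."""

    def __init__(self):
        super().__init__()
        self.diverted = {}

    def __setitem__(self, key, value):
        slots = self.get("__slots__", ())
        if isinstance(slots, str):
            slots = (slots,)
        if key in slots:
            # Keep the default out of the namespace so no conflict arises.
            self.diverted[key] = value
        else:
            super().__setitem__(key, value)


class PrepareMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body is executed with this mapping as its namespace.
        return DivertingNamespace()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwargs)
        # Store the diverted defaults somewhere convenient for later use.
        cls.slot_defaults = dict(namespace.diverted)
        return cls


class C(metaclass=PrepareMeta):
    __slots__ = ("x",)
    x = 1  # diverted instead of clashing with the slot


c = C()
```

The class is created without error, the slot descriptor for x is intact, and the default is available in C.slot_defaults for whatever machinery wants to use it afterwards.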
Currently, there is no opportunity (other than writing a metaclass) to write code that alters the class namespace before the class object is instantiated, and the slots machinery goes to work. This could be changed by creating a hook for preparing the class namespace beforehand, and making it so that an error complaining about the conflicting names is only produced after that hook has been run.
This so-called __prepare_slots__ hook could look something like this, which I think is not too bad:

    from dataclasses import dataclass, field, prepare_slots

    @dataclass
    class C:
        __slots__ = ('x',)
        __prepare_slots__ = prepare_slots
        x: int = field(default=1)
The dataclasses.prepare_slots function would simply be a function, similar to the __prepare__ method, that receives the class namespace and alters it before the class is created. For this case in particular, the default dataclass field values would be stored in some other convenient place so that they can be retrieved after the slot descriptor objects have been created.
* Note that the default field value conflicting with the slot might also be created by the dataclass machinery if dataclasses.field is being used.
As noted already in the answers, data classes from dataclasses cannot generate slots for the simple reason that slots must be defined before a class is created.
In fact, the PEP for data classes explicitly mentions this:
"At least for the initial release, __slots__ will not be supported. __slots__ needs to be added at class creation time. The Data Class decorator is called after the class is created, so in order to add __slots__ the decorator would have to create a new class, set __slots__, and return it. Because this behavior is somewhat surprising, the initial version of Data Classes will not support automatically setting __slots__."
I wanted to use slots because I needed to initialise many, many data class instances in another project. I ended up writing my own alternative implementation of data classes which supports this, among a few extra features: dataclassy.
dataclassy uses a metaclass approach, which has numerous advantages: it enables decorator inheritance, considerably reduces code complexity and, of course, allows the generation of slots. With dataclassy the following is possible:

    from dataclassy import dataclass

    @dataclass(slots=True)
    class Pet:
        name: str
        age: int
        species: str
        fluffy: bool = True
Printing Pet.__slots__ outputs the expected {'name', 'age', 'species', 'fluffy'}, instances have no __dict__ attribute and the overall memory footprint of the object is therefore lower. These observations indicate that __slots__ has been successfully generated and is effective. Plus, as evidenced, default values work just fine.