I ran into a strange issue while trying to use a dataclass together with a property.
I have it down to a minumum to reproduce it:
import dataclasses
@dataclasses.dataclass
class FileObject:
_uploaded_by: str = dataclasses.field(default=None, init=False)
uploaded_by: str = None
def save(self):
print(self.uploaded_by)
@property
def uploaded_by(self):
return self._uploaded_by
@uploaded_by.setter
def uploaded_by(self, uploaded_by):
print('Setter Called with Value ', uploaded_by)
self._uploaded_by = uploaded_by
p = FileObject()
p.save()
This outputs:
Setter Called with Value <property object at 0x7faeb00150b0>
<property object at 0x7faeb00150b0>
I would expect to get None instead of
Am I doing something wrong here or have I stumbled across a bug?
After reading @juanpa.arrivillaga answer I thought that making uploaded_by and InitVar might fix the issue, but it still return a property object. I think it is because of the this that he said:
the datalcass machinery interprets any assignment to a type-annotated variable in the class body as the default value to the created
__init__
.
The only option I can find that works with the default value is to remove the uploadedby from the dataclass defintion and write an actual __init__
. That has an unfortunate side effect of requiring you to write an __init__
for the dataclass manually which negates some of the value of using a dataclass. Here is what I did:
import dataclasses
@dataclasses.dataclass
class FileObject:
_uploaded_by: str = dataclasses.field(default=None, init=False)
uploaded_by: dataclasses.InitVar=None
other_attrs: str = None
def __init__(self, uploaded_by=None, other_attrs=None):
self._uploaded_by = uploaded_by
self.other_attrs = other_attrs
def save(self):
print("Uploaded by: ", self.uploaded_by)
print("Other Attrs: ", self.other_attrs)
@property
def uploaded_by(self):
if not self._uploaded_by:
print("Doing expensive logic that should not be repeated")
return self._uploaded_by
p = FileObject(other_attrs="More Data")
p.save()
p2 = FileObject(uploaded_by='Already Computed', other_attrs="More Data")
p2.save()
Which outputs:
Doing expensive logic that should not be repeated
Uploaded by: None
Other Attrs: More Data
Uploaded by: Already Computed
Other Attrs: More Data
The negatives of doing this:
__init__
(My actual use case has about
20 attrs) __repr__
, but it is there
in _uploaded_by So it's really not a fix for the issue
I have filed a bug on the Python Bug Tracker: https://bugs.python.org/issue39247
So, unfortunately, the @property
syntax is always interpreted as an assignment to uploaded_by
(since, well, it is). The dataclass
machinery is interpreting that as a default value, hence why it is passing the property object! It is equivalent to this:
In [11]: import dataclasses
...:
...: @dataclasses.dataclass
...: class FileObject:
...: uploaded_by: str
...: _uploaded_by: str = dataclasses.field(repr=False, init=False)
...: def save(self):
...: print(self.uploaded_by)
...:
...: def _get_uploaded_by(self):
...: return self._uploaded_by
...:
...: def _set_uploaded_by(self, uploaded_by):
...: print('Setter Called with Value ', uploaded_by)
...: self._uploaded_by = uploaded_by
...: uploaded_by = property(_get_uploaded_by, _set_uploaded_by)
...: p = FileObject()
...: p.save()
Setter Called with Value <property object at 0x10761e7d0>
<property object at 0x10761e7d0>
Which is essentially acting like this:
In [13]: @dataclasses.dataclass
...: class Foo:
...: bar:int = 1
...: bar = 2
...:
In [14]: Foo()
Out[14]: Foo(bar=2)
I don't think there is a clean way around this, and perhaps it could be considered a bug, but really, not sure what the solution should be, because essentially, the datalcass machinery interprets any assignment to a type-annotated variable in the class body as the default value to the created __init__
. You could perhaps either special-case the @property
syntax, or maybe just the property
object itself, so at least the behavior for @property
and x = property(set_x, get_x)
would be consistent...
To be clear, the following sort of works:
In [22]: import dataclasses
...:
...: @dataclasses.dataclass
...: class FileObject:
...: uploaded_by: str
...: _uploaded_by: str = dataclasses.field(repr=False, init=False)
...: @property
...: def uploaded_by(self):
...: return self._uploaded_by
...: @uploaded_by.setter
...: def uploaded_by(self, uploaded_by):
...: print('Setter Called with Value ', uploaded_by)
...: self._uploaded_by = uploaded_by
...:
...: p = FileObject(None)
...: print(p.uploaded_by)
Setter Called with Value None
None
In [23]: FileObject()
Setter Called with Value <property object at 0x1086debf0>
Out[23]: FileObject(uploaded_by=<property object at 0x1086debf0>)
But notice, you cannot set a useful default value! It will always take the property... Even worse, IMO, if you don't want a default value it will always create one!
This should have been obvious, but you can just set the property object on the class.
import dataclasses
import typing
@dataclasses.dataclass
class FileObject:
uploaded_by:typing.Optional[str]=None
def _uploaded_by_getter(self):
return self._uploaded_by
def _uploaded_by_setter(self, uploaded_by):
print('Setter Called with Value ', uploaded_by)
self._uploaded_by = uploaded_by
FileObject.uploaded_by = property(
FileObject._uploaded_by_getter,
FileObject._uploaded_by_setter
)
p = FileObject()
print(p)
print(p.uploaded_by)
The alternative take on @juanpa.arrivillaga solution of setting properties, which may look a tad more object-oriented, initially proposed at python-list by Peter Otten
import dataclasses
from typing import Optional
@dataclasses.dataclass
class FileObject:
uploaded_by: Optional[str] = None
class FileObjectExpensive(FileObject):
@property
def uploaded_by(self):
return self._uploaded_by
@uploaded_by.setter
def uploaded_by(self, uploaded_by):
print('Setter Called with Value ', uploaded_by)
self._uploaded_by = uploaded_by
def save(self):
print(self.uploaded_by)
p = FileObjectExpensive()
p.save()
p2 = FileObjectExpensive(uploaded_by='Already Computed')
p2.save()
This outputs:
Setter Called with Value None
None
Setter Called with Value Already Computed
Already Computed
To me this approach, while not being perfect in terms of removing boilerplate, has a little more readability and explicitness in the separation of the pure data container and behaviour on that data. And it keeps all variables' and properties' names the same, so readability seems to be the same.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With