
Weird Issue when using dataclass and property together

I ran into a strange issue while trying to use a dataclass together with a property.

I have reduced it to a minimal reproduction:

import dataclasses

@dataclasses.dataclass
class FileObject:
    _uploaded_by: str = dataclasses.field(default=None, init=False)
    uploaded_by: str = None

    def save(self):
        print(self.uploaded_by)

    @property
    def uploaded_by(self):
        return self._uploaded_by

    @uploaded_by.setter
    def uploaded_by(self, uploaded_by):
        print('Setter Called with Value ', uploaded_by)
        self._uploaded_by = uploaded_by

p = FileObject()
p.save()

This outputs:

Setter Called with Value  <property object at 0x7faeb00150b0>
<property object at 0x7faeb00150b0>

I would expect to get None instead of the property object.

Am I doing something wrong here or have I stumbled across a bug?

After reading @juanpa.arrivillaga's answer, I thought that making uploaded_by an InitVar might fix the issue, but it still returns a property object. I think that is because of this point he made:

the dataclass machinery interprets any assignment to a type-annotated variable in the class body as the default value to the created __init__.

The only option I can find that works with the default value is to remove uploaded_by from the dataclass definition and write the __init__ by hand, which negates some of the value of using a dataclass. Here is what I did:

import dataclasses

@dataclasses.dataclass
class FileObject:
    _uploaded_by: str = dataclasses.field(default=None, init=False)
    uploaded_by: dataclasses.InitVar=None
    other_attrs: str = None

    def __init__(self, uploaded_by=None, other_attrs=None):
        self._uploaded_by = uploaded_by
        self.other_attrs = other_attrs

    def save(self):
        print("Uploaded by: ", self.uploaded_by)
        print("Other Attrs: ", self.other_attrs)

    @property
    def uploaded_by(self):
        if not self._uploaded_by:
            print("Doing expensive logic that should not be repeated")
        return self._uploaded_by

p = FileObject(other_attrs="More Data")
p.save()

p2 = FileObject(uploaded_by='Already Computed', other_attrs="More Data")
p2.save()

Which outputs:

Doing expensive logic that should not be repeated
Uploaded by:  None
Other Attrs:  More Data
Uploaded by:  Already Computed
Other Attrs:  More Data

The negatives of doing this:

  • You have to write boilerplate __init__ (My actual use case has about 20 attrs)
  • You lose the uploaded_by in the __repr__, but it is there in _uploaded_by
  • Calls to asdict, astuple, dataclasses.replace aren't handled correctly

So it's really not a fix for the issue
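To make the last two bullets concrete, here is a minimal sketch of the same InitVar plus hand-written __init__ layout, trimmed to two attributes and with the print calls removed. It only shows which names the dataclass helpers see; how dataclasses.replace behaves with this layout is not covered here.

```python
import dataclasses


@dataclasses.dataclass
class FileObject:
    _uploaded_by: str = dataclasses.field(default=None, init=False)
    uploaded_by: dataclasses.InitVar = None
    other_attrs: str = None

    # Hand-written __init__; @dataclass leaves an __init__ defined in the class body alone.
    def __init__(self, uploaded_by=None, other_attrs=None):
        self._uploaded_by = uploaded_by
        self.other_attrs = other_attrs

    @property
    def uploaded_by(self):
        return self._uploaded_by


p = FileObject(uploaded_by="Already Computed", other_attrs="More Data")

# Only the real fields are reported; the public uploaded_by name is absent.
field_names = [f.name for f in dataclasses.fields(p)]
as_dict = dataclasses.asdict(p)
print(field_names)
print(as_dict)
print(repr(p))
```

The value is still reachable, but only under the private _uploaded_by key, so any code consuming asdict output has to know about the underscore name.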

I have filed a bug on the Python Bug Tracker: https://bugs.python.org/issue39247

Michael Robellard asked Jan 07 '20 07:01


2 Answers

So, unfortunately, the @property syntax is always interpreted as an assignment to uploaded_by (since, well, it is). The dataclass machinery interprets that as a default value, which is why it passes the property object! It is equivalent to this:

In [11]: import dataclasses
    ...:
    ...: @dataclasses.dataclass
    ...: class FileObject:
    ...:     uploaded_by: str
    ...:     _uploaded_by: str = dataclasses.field(repr=False, init=False)
    ...:     def save(self):
    ...:         print(self.uploaded_by)
    ...:
    ...:     def _get_uploaded_by(self):
    ...:         return self._uploaded_by
    ...:
    ...:     def _set_uploaded_by(self, uploaded_by):
    ...:         print('Setter Called with Value ', uploaded_by)
    ...:         self._uploaded_by = uploaded_by
    ...:     uploaded_by = property(_get_uploaded_by, _set_uploaded_by)
    ...: p = FileObject()
    ...: p.save()
Setter Called with Value  <property object at 0x10761e7d0>
<property object at 0x10761e7d0>

Which is essentially acting like this:

In [13]: @dataclasses.dataclass
    ...: class Foo:
    ...:     bar:int = 1
    ...:     bar = 2
    ...:

In [14]: Foo()
Out[14]: Foo(bar=2)

I don't think there is a clean way around this, and perhaps it could be considered a bug, but really, I'm not sure what the solution should be, because essentially, the dataclass machinery interprets any assignment to a type-annotated variable in the class body as the default value to the created __init__. You could perhaps either special-case the @property syntax, or maybe just the property object itself, so at least the behavior for @property and x = property(get_x, set_x) would be consistent...
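One way to see this directly is to ask the dataclass what default it recorded for the field. The sketch below reuses the class from the question (setter print removed) and inspects it with dataclasses.fields; the variable names are only for illustration.

```python
import dataclasses


@dataclasses.dataclass
class FileObject:
    _uploaded_by: str = dataclasses.field(default=None, init=False)
    uploaded_by: str = None  # rebound to a property a few lines below

    @property
    def uploaded_by(self):
        return self._uploaded_by

    @uploaded_by.setter
    def uploaded_by(self, value):
        self._uploaded_by = value


# Map each field name to the default that @dataclass recorded for it.
recorded = {f.name: f.default for f in dataclasses.fields(FileObject)}
default_is_property = isinstance(recorded["uploaded_by"], property)
same_object = recorded["uploaded_by"] is FileObject.__dict__["uploaded_by"]
print(default_is_property, same_object)
```

By the time the decorator runs, the class attribute uploaded_by already holds the property, so that object is what gets recorded as the default rather than the None written next to the annotation.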

To be clear, the following sort of works:

In [22]: import dataclasses
    ...:
    ...: @dataclasses.dataclass
    ...: class FileObject:
    ...:     uploaded_by: str
    ...:     _uploaded_by: str = dataclasses.field(repr=False, init=False)
    ...:     @property
    ...:     def uploaded_by(self):
    ...:         return self._uploaded_by
    ...:     @uploaded_by.setter
    ...:     def uploaded_by(self, uploaded_by):
    ...:         print('Setter Called with Value ', uploaded_by)
    ...:         self._uploaded_by = uploaded_by
    ...:
    ...: p = FileObject(None)
    ...: print(p.uploaded_by)
Setter Called with Value  None
None

In [23]: FileObject()
Setter Called with Value  <property object at 0x1086debf0>
Out[23]: FileObject(uploaded_by=<property object at 0x1086debf0>)

But notice, you cannot set a useful default value! It will always take the property... Even worse, IMO, if you don't want a default value it will always create one!
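The "it will always create one" part can be checked with inspect.signature. This is a small sketch using the same class as In [22], with the setter print removed: the annotation has no default, yet the generated __init__ parameter does.

```python
import dataclasses
import inspect


@dataclasses.dataclass
class FileObject:
    uploaded_by: str  # no default written here
    _uploaded_by: str = dataclasses.field(repr=False, init=False)

    @property
    def uploaded_by(self):
        return self._uploaded_by

    @uploaded_by.setter
    def uploaded_by(self, value):
        self._uploaded_by = value


# Inspect the parameter that the generated __init__ exposes.
param = inspect.signature(FileObject.__init__).parameters["uploaded_by"]
has_default = param.default is not inspect.Parameter.empty
default_is_property = isinstance(param.default, property)
print(has_default, default_is_property)
```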

EDIT: Found a potential workaround!

This should have been obvious, but you can just set the property object on the class.

import dataclasses
import typing
@dataclasses.dataclass
class FileObject:
    uploaded_by:typing.Optional[str]=None

    def _uploaded_by_getter(self):
        return self._uploaded_by

    def _uploaded_by_setter(self, uploaded_by):
        print('Setter Called with Value ', uploaded_by)
        self._uploaded_by = uploaded_by

FileObject.uploaded_by = property(
    FileObject._uploaded_by_getter,
    FileObject._uploaded_by_setter
)
p = FileObject()
print(p)
print(p.uploaded_by)
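The snippet above is shown without its output. As a rough check of what the workaround gives you, here is a self-contained variant with the setter's print removed. The asdict and replace calls are additions for illustration, since those were the helpers the question was worried about.

```python
import dataclasses
import typing


@dataclasses.dataclass
class FileObject:
    uploaded_by: typing.Optional[str] = None

    def _uploaded_by_getter(self):
        return self._uploaded_by

    def _uploaded_by_setter(self, uploaded_by):
        self._uploaded_by = uploaded_by


# Attach the property only after @dataclass has recorded None as the default.
FileObject.uploaded_by = property(
    FileObject._uploaded_by_getter,
    FileObject._uploaded_by_setter,
)

p = FileObject()
p2 = dataclasses.replace(p, uploaded_by="Already Computed")

print(p)                       # repr goes through the property getter
print(dataclasses.asdict(p2))  # the public name is used as the key
```

Because the property is installed after the decorator has run, the generated __init__ keeps None as its default and simply assigns through the setter.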
juanpa.arrivillaga answered Oct 14 '22 21:10


An alternative take on @juanpa.arrivillaga's solution of setting properties, which may look a tad more object-oriented, was initially proposed on python-list by Peter Otten:

import dataclasses
from typing import Optional


@dataclasses.dataclass
class FileObject:
    uploaded_by: Optional[str] = None

class FileObjectExpensive(FileObject):
    @property
    def uploaded_by(self):
        return self._uploaded_by

    @uploaded_by.setter
    def uploaded_by(self, uploaded_by):
        print('Setter Called with Value ', uploaded_by)
        self._uploaded_by = uploaded_by

    def save(self):
        print(self.uploaded_by)

p = FileObjectExpensive()
p.save()
p2 = FileObjectExpensive(uploaded_by='Already Computed')
p2.save()

This outputs:

Setter Called with Value  None
None
Setter Called with Value  Already Computed
Already Computed

To me this approach, while not perfect in terms of removing boilerplate, is a little more readable and explicit in how it separates the pure data container from the behaviour on that data. It also keeps the names of all variables and properties the same, so nothing is lost in readability.
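If the goal is the "expensive logic that should not be repeated" case from the question, the same subclass pattern can cache the value on first access. This is only a sketch: LazyFileObject, _compute_uploaded_by and the compute_calls counter are hypothetical names invented for the example, not part of either answer.

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class FileObject:
    uploaded_by: Optional[str] = None


class LazyFileObject(FileObject):  # hypothetical name
    compute_calls = 0  # counter used only to make the caching visible

    def _compute_uploaded_by(self) -> str:
        # Hypothetical stand-in for the expensive lookup.
        type(self).compute_calls += 1
        return "computed-user"

    @property
    def uploaded_by(self) -> Optional[str]:
        # Compute once, on first read, if nothing was supplied.
        if self._uploaded_by is None:
            self._uploaded_by = self._compute_uploaded_by()
        return self._uploaded_by

    @uploaded_by.setter
    def uploaded_by(self, value: Optional[str]) -> None:
        self._uploaded_by = value


p = LazyFileObject()
first = p.uploaded_by
second = p.uploaded_by  # served from the cached value
p2 = LazyFileObject(uploaded_by="Already Computed")
print(first, second, p2.uploaded_by, LazyFileObject.compute_calls)
```

One thing to keep in mind with this design is that repr, asdict and equality checks all read the field through the getter, so any of them can trigger the computation.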

Ivan Ivanyuk answered Oct 14 '22 21:10