I'm trying to use the new Python dataclasses to create some mix-in classes (even as I write this, it sounds like a rash idea), and I'm having some issues. Behold the example below:
from dataclasses import dataclass

@dataclass
class NamedObj:
    name: str

    def __post_init__(self):
        print("NamedObj __post_init__")
        self.name = "Name: " + self.name

@dataclass
class NumberedObj:
    number: int = 0

    def __post_init__(self):
        print("NumberedObj __post_init__")
        self.number += 1

@dataclass
class NamedAndNumbered(NumberedObj, NamedObj):
    def __post_init__(self):
        super().__post_init__()
        print("NamedAndNumbered __post_init__")
If I then try:
nandn = NamedAndNumbered('n_and_n')
print(nandn.name)
print(nandn.number)
I get
NumberedObj __post_init__
NamedAndNumbered __post_init__
n_and_n
1
Suggesting it has run __post_init__ for NumberedObj, but not for NamedObj.
What I would like is to have NamedAndNumbered run __post_init__ for both of its mix-in classes, NamedObj and NumberedObj. One might think that it could be done if NamedAndNumbered had a __post_init__ like this:
def __post_init__(self):
    super(NamedObj, self).__post_init__()
    super(NumberedObj, self).__post_init__()
    print("NamedAndNumbered __post_init__")
But this just gives me an error, AttributeError: 'super' object has no attribute '__post_init__', when I try to call NamedObj.__post_init__().
At this point I'm not entirely sure if this is a bug/feature with dataclasses or something to do with my probably-flawed understanding of Python's approach to inheritance. Could anyone lend a hand?
Method resolution order: With multiple inheritance, an attribute is first looked up in the current class; if it is not found there, the parent classes are searched. The parents are searched left to right, following the order given by the class's MRO, and each class is searched only once.
In Python a class can inherit from more than one class, which is why this is called multiple inheritance. A class that inherits has the methods and variables of all its parent classes. This is a concept from object-oriented programming.
The problem with multiple inheritance: if you allow it, the same class may be reachable through more than one inheritance path. In Python all classes inherit from object, so object is reachable through every parent whenever multiple inheritance is used. The MRO deals with this by linearizing the hierarchy so that each class, including object, appears exactly once.
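To make the lookup order concrete, here is a small sketch using a made-up diamond hierarchy (the classes A, B, C and D are hypothetical and not part of the question). It only inspects `__mro__`, which shows the left-to-right search order and that `object` is listed once even though it is reachable through both parents.

```python
class A:
    def who(self):
        return "A"

class B(A):
    def who(self):
        return "B"

class C(A):
    def who(self):
        return "C"

class D(B, C):
    # D defines nothing itself, so lookups fall through to its parents.
    pass

# The linearized search order for D.
mro_names = [cls.__name__ for cls in D.__mro__]
print(mro_names)

# The first class in the MRO that defines who() wins.
result = D().who()
print(result)
```

Here B comes before C because it is listed first in `class D(B, C)`, and A comes after both because both of them inherit from it.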
This:
def __post_init__(self):
    super(NamedObj, self).__post_init__()
    super(NumberedObj, self).__post_init__()
    print("NamedAndNumbered __post_init__")
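doesn't do what you think it does. super(cls, obj) returns a proxy to the class that comes after cls in type(obj).__mro__, so in your case super(NamedObj, self) resolves to object, which has no __post_init__ (and super(NumberedObj, self) resolves to NamedObj, not NumberedObj). And the whole point of cooperative super() calls is to avoid having to explicitly call each of the parents.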
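You can check this yourself with a quick sketch based on the classes from the question (prints removed, and the subclass body left empty, which is my simplification). It just asks each two-argument super() proxy what it would find.

```python
from dataclasses import dataclass

@dataclass
class NamedObj:
    name: str

    def __post_init__(self):
        self.name = "Name: " + self.name

@dataclass
class NumberedObj:
    number: int = 0

    def __post_init__(self):
        self.number += 1

@dataclass
class NamedAndNumbered(NumberedObj, NamedObj):
    pass

mro_names = [cls.__name__ for cls in NamedAndNumbered.__mro__]
print(mro_names)

obj = NamedAndNumbered("n_and_n")

# super(NamedObj, obj) starts looking *after* NamedObj, i.e. at object,
# which defines no __post_init__.
after_named_has_hook = hasattr(super(NamedObj, obj), "__post_init__")

# super(NumberedObj, obj) starts looking after NumberedObj, so it finds
# NamedObj.__post_init__, not NumberedObj's own method.
after_numbered_func = super(NumberedObj, obj).__post_init__.__func__

print(after_named_has_hook)
print(after_numbered_func is NamedObj.__post_init__)
```

So the first line of the attempted `__post_init__` is the one that raises the AttributeError, before the second line ever runs.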
The way cooperative super() calls are intended to work is, well, by being "cooperative" - in other words, everyone in the mro is supposed to relay the call to the next class (actually, the super name is a rather sad choice, as it's not about calling "the super class", but about "calling the next class in the mro").
IOW, you want each of your "composable" dataclasses (which are not mixins - mixins only have behaviour) to relay the call, so you can compose them in any order. A first naive implementation would look like:
@dataclass
class NamedObj:
    name: str

    def __post_init__(self):
        super().__post_init__()
        print("NamedObj __post_init__")
        self.name = "Name: " + self.name

@dataclass
class NumberedObj:
    number: int = 0

    def __post_init__(self):
        super().__post_init__()
        print("NumberedObj __post_init__")
        self.number += 1

@dataclass
class NamedAndNumbered(NumberedObj, NamedObj):
    def __post_init__(self):
        super().__post_init__()
        print("NamedAndNumbered __post_init__")
BUT this doesn't work, since for the last class in the mro (here NamedObj), the next class in the mro is the builtin object class, which doesn't have a __post_init__ method. The solution is simple: just add a base class that defines this method as a noop, and make all your composable dataclasses inherit from it:
class Base(object):
    def __post_init__(self):
        # just intercept the __post_init__ calls so they
        # aren't relayed to `object`
        pass

@dataclass
class NamedObj(Base):
    name: str

    def __post_init__(self):
        super().__post_init__()
        print("NamedObj __post_init__")
        self.name = "Name: " + self.name

@dataclass
class NumberedObj(Base):
    number: int = 0

    def __post_init__(self):
        super().__post_init__()
        print("NumberedObj __post_init__")
        self.number += 1

@dataclass
class NamedAndNumbered(NumberedObj, NamedObj):
    def __post_init__(self):
        super().__post_init__()
        print("NamedAndNumbered __post_init__")
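To illustrate that the hooks then follow the mro whichever way you compose the classes, here is a rough sketch with two made-up dataclasses, Tagged and Counted (hypothetical names, not from the question), and a `calls` list that records the order in which the hooks finish. Both fields are given defaults on purpose: dataclasses reject a field without a default that comes after one with a default in the combined field order, so with the question's own classes, swapping the bases should fail at class creation because `name` has no default.

```python
from dataclasses import dataclass

calls = []  # records the order in which the hooks finish

class Base:
    def __post_init__(self):
        # End of the chain: nothing is relayed to `object`.
        pass

@dataclass
class Tagged(Base):
    tag: str = "x"

    def __post_init__(self):
        super().__post_init__()  # relay to the next class in the mro
        calls.append("Tagged")

@dataclass
class Counted(Base):
    count: int = 0

    def __post_init__(self):
        super().__post_init__()  # relay to the next class in the mro
        calls.append("Counted")

@dataclass
class TaggedCounted(Tagged, Counted):
    pass

@dataclass
class CountedTagged(Counted, Tagged):
    pass

TaggedCounted()
first_order = list(calls)

calls.clear()
CountedTagged()
second_order = list(calls)

print(first_order)
print(second_order)
```

Since each hook relays the call before doing its own work, the class that sits last in the mro (just before Base) finishes first.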
The problem (most probably) isn't related to dataclasses. The problem is in Python's method resolution. Calling a method on super() invokes only the first matching method found in the parent classes along the MRO chain. So one way to make it work is to call the methods of the parent classes manually:
@dataclass
class NamedAndNumbered(NumberedObj, NamedObj):
    def __post_init__(self):
        NamedObj.__post_init__(self)
        NumberedObj.__post_init__(self)
        print("NamedAndNumbered __post_init__")
Another approach (if you really like super()) could be to continue the MRO chain by calling super() in all parent classes (but it needs to have a __post_init__ in the chain):
@dataclass
class MixinObj:
    def __post_init__(self):
        pass

@dataclass
class NamedObj(MixinObj):
    name: str

    def __post_init__(self):
        super().__post_init__()
        print("NamedObj __post_init__")
        self.name = "Name: " + self.name

@dataclass
class NumberedObj(MixinObj):
    number: int = 0

    def __post_init__(self):
        super().__post_init__()
        print("NumberedObj __post_init__")
        self.number += 1

@dataclass
class NamedAndNumbered(NumberedObj, NamedObj):
    def __post_init__(self):
        super().__post_init__()
        print("NamedAndNumbered __post_init__")
In both approaches:
>>> nandn = NamedAndNumbered('n_and_n')
NamedObj __post_init__
NumberedObj __post_init__
NamedAndNumbered __post_init__
>>> print(nandn.name)
Name: n_and_n
>>> print(nandn.number)
1
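One caveat that seems worth a sketch: the two approaches probably shouldn't be mixed. If the parent classes already relay the call with super() and the subclass also calls each parent explicitly, a hook can run more than once. The subclass name `Mixed` below is hypothetical, and the prints are dropped for brevity.

```python
from dataclasses import dataclass

@dataclass
class MixinObj:
    def __post_init__(self):
        pass

@dataclass
class NamedObj(MixinObj):
    name: str

    def __post_init__(self):
        super().__post_init__()
        self.name = "Name: " + self.name

@dataclass
class NumberedObj(MixinObj):
    number: int = 0

    def __post_init__(self):
        super().__post_init__()
        self.number += 1

@dataclass
class Mixed(NumberedObj, NamedObj):  # hypothetical subclass mixing both styles
    def __post_init__(self):
        # Explicit call: runs NamedObj's hook once.
        NamedObj.__post_init__(self)
        # Explicit call: NumberedObj's hook relays via super() to NamedObj,
        # so NamedObj's hook runs a second time.
        NumberedObj.__post_init__(self)

mixed = Mixed("n_and_n")
print(mixed.name)
print(mixed.number)
```

The name prefix ends up applied twice, so it's safer to pick one style for the whole hierarchy and stick to it.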