Dataclass example:
@dataclass
class StatusElement:
status: str
orderindex: int
color: str
type: str
@dataclass
class List:
id: int
statuses: List[StatusElement]
JSON example:
json = {
"id": "124",
"statuses": [
{
"status": "to do",
"orderindex": 0,
"color": "#d3d3d3",
"type": "open"
}]
}
I can unpack the JSON doing something like this:
object = List(**json)
But I'm not sure how can I also unpack the statuses into a status object and appened to the statuses list of the List object? I'm sure I need to loop over it somehow but not sure how to combine that with unpacking.
Python dataclasses is a great module, but one of the things it doesn't unfortunately handle is parsing a JSON object to a nested dataclass structure.
A few workarounds exist for this:
from_json which converts a JSON string to an List instance with a nested dataclass.pydantic is a popular one that supports this use case.Here is an example using the dataclass-wizard library that works well enough for your use case. It's more lightweight than pydantic and coincidentally also a little faster. It also supports automatic case transforms and type conversions (for example str to annotated int)
Example below:
from dataclasses import dataclass
from typing import List as PyList
from dataclass_wizard import JSONWizard
@dataclass
class List(JSONWizard):
id: int
statuses: PyList['StatusElement']
# on Python 3.9+ you can use the following syntax:
# statuses: list['StatusElement']
@dataclass
class StatusElement:
status: str
order_index: int
color: str
type: str
json = {
"id": "124",
"statuses": [
{
"status": "to do",
"orderIndex": 0,
"color": "#d3d3d3",
"type": "open"
}]
}
object = List.from_dict(json)
print(repr(object))
# List(id=124, statuses=[StatusElement(status='to do', order_index=0, color='#d3d3d3', type='open')])
Disclaimer: I am the creator (and maintainer) of this library.
You can now skip the class inheritance as of the latest release of dataclass-wizard. It's straightforward enough to use it; using the same example from above, but I've removed the JSONWizard usage from it completely. Just remember to ensure you don't import asdict from the dataclasses module, even though I guess that should coincidentally work.
Here's the modified version of the above without class inheritance:
from dataclasses import dataclass
from typing import List as PyList
from dataclass_wizard import fromdict, asdict
@dataclass
class List:
id: int
statuses: PyList['StatusElement']
@dataclass
class StatusElement:
status: str
order_index: int
color: str
type: str
json = {
"id": "124",
"statuses": [
{
"status": "to do",
"orderIndex": 0,
"color": "#d3d3d3",
"type": "open"
}]
}
# De-serialize the JSON dictionary into a `List` instance.
c = fromdict(List, json)
print(c)
# List(id=124, statuses=[StatusElement(status='to do', order_index=0, color='#d3d3d3', type='open')])
# Convert the instance back to a dictionary object that is JSON-serializable.
d = asdict(c)
print(d)
# {'id': 124, 'statuses': [{'status': 'to do', 'orderIndex': 0, 'color': '#d3d3d3', 'type': 'open'}]}
Also, here's a quick performance comparison with dacite. I wasn't aware of this library before, but it's also very easy to use (and there's also no need to inherit from any class). However, from my personal tests - Windows 10 Alienware PC using Python 3.9.1 - dataclass-wizard seemed to perform much better overall on the de-serialization process.
from dataclasses import dataclass
from timeit import timeit
from typing import List
from dacite import from_dict
from dataclass_wizard import JSONWizard, fromdict
data = {
"id": 124,
"statuses": [
{
"status": "to do",
"orderindex": 0,
"color": "#d3d3d3",
"type": "open"
}]
}
@dataclass
class StatusElement:
status: str
orderindex: int
color: str
type: str
@dataclass
class List:
id: int
statuses: List[StatusElement]
class ListWiz(List, JSONWizard):
...
n = 100_000
# 0.37
print('dataclass-wizard: ', timeit('ListWiz.from_dict(data)', number=n, globals=globals()))
# 0.36
print('dataclass-wizard (fromdict): ', timeit('fromdict(List, data)', number=n, globals=globals()))
# 11.2
print('dacite: ', timeit('from_dict(List, data)', number=n, globals=globals()))
lst_wiz1 = ListWiz.from_dict(data)
lst_wiz2 = from_dict(List, data)
lst = from_dict(List, data)
# True
assert lst.__dict__ == lst_wiz1.__dict__ == lst_wiz2.__dict__
I've spent a few hours investigating options for this. There's no native Python functionality to do this, but there are a few third-party packages (writing in November 2022):
marshmallow_dataclass has this functionality (you need not be using marshmallow in any other capacity in your project). It gives good error messages and the package is actively maintained. I used this for a while before hitting what I believe is a bug parsing a large and complex JSON into deeply nested dataclasses, and then had to switch away.dataclass-wizard is easy to use and specifically addresses this use case. It has excellent documentation. One significant disadvantage is that it won't automatically attempt to find the right fit for a given JSON, if trying to match against a union of dataclasses (see https://dataclass-wizard.readthedocs.io/en/latest/common_use_cases/dataclasses_in_union_types.html). Instead it asks you to add a "tag key" to the input JSON, which is a robust solution but may not be possible if you have no control over the input JSON.dataclass-json is similar to dataclass-wizard, and again doesn't attempt to match the correct dataclass within a union.dacite is the option I have settled upon for the time being. It has similar functionality to marshmallow_dataclass, at least for JSON parsing. The error messages are significantly less clear than marshmallow_dataclass, but slightly offsetting this, it's easier to figure out what's wrong if you pdb in at the point that the error occurs - the internals are quite clear and you can experiment to see what's going wrong. According to others it is rather slow, but that's not a problem in my circumstance.If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With