Python dataclass from a nested dict

Tags:

python

python-3.x

python-dataclasses

The standard library in 3.7 can recursively convert a dataclass into a dict (example from the docs):

from dataclasses import dataclass, asdict
from typing import List

@dataclass
class Point:
    x: int
    y: int

@dataclass
class C:
    mylist: List[Point]

p = Point(10, 20)
assert asdict(p) == {'x': 10, 'y': 20}

c = C([Point(0, 0), Point(10, 4)])
tmp = {'mylist': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]}
assert asdict(c) == tmp

I am looking for a way to turn a dict back into a dataclass when there is nesting. Something like C(**tmp) only works if the fields of the dataclass are simple types rather than dataclasses themselves. I am familiar with jsonpickle, which, however, comes with a prominent security warning.
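To make the problem concrete, here is a small sketch (repeating the classes from above so it runs on its own) of what the naive approach produces. The call itself succeeds, because dataclasses do not enforce their annotations, but the nested items stay plain dicts:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Point:
    x: int
    y: int


@dataclass
class C:
    mylist: List[Point]


tmp = {'mylist': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]}

# No error is raised here: the annotation List[Point] is not checked.
naive = C(**tmp)

# The list items are still dicts, so the result differs from the original.
print(type(naive.mylist[0]))
print(naive == C([Point(0, 0), Point(10, 4)]))
```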

EDIT:

Answers have suggested the following libraries:

dacite
mashumaro (I used for a while, works well but I quickly ran into tricky corner cases)
pydantic (works very well, excellent documentation and fewer corner cases)

381

asked Nov 19 '18 13:11

mbatchkarov

5 Answers

I'm the author of dacite - the tool that simplifies creation of data classes from dictionaries.

This library has only one function, from_dict. Here is a quick usage example:

from dataclasses import dataclass
from dacite import from_dict

@dataclass
class User:
    name: str
    age: int
    is_active: bool

data = {
    'name': 'john',
    'age': 30,
    'is_active': True,
}

user = from_dict(data_class=User, data=data)

assert user == User(name='john', age=30, is_active=True)

Moreover, dacite supports the following features:

nested structures
(basic) type checking
optional fields (i.e. typing.Optional)
unions
collections
value casting and transformation
remapping of field names

... and it's well tested - 100% code coverage!

To install dacite, simply use pip (or pipenv):

$ pip install dacite

162

answered Oct 06 '22 12:10

Konrad Hałas

Below is the CPython implementation of asdict – or specifically, the internal recursive helper function _asdict_inner that it uses:

# Source: https://github.com/python/cpython/blob/master/Lib/dataclasses.py

def _asdict_inner(obj, dict_factory):
    if _is_dataclass_instance(obj):
        result = []
        for f in fields(obj):
            value = _asdict_inner(getattr(obj, f.name), dict_factory)
            result.append((f.name, value))
        return dict_factory(result)
    elif isinstance(obj, tuple) and hasattr(obj, '_fields'):
        # [large block of author comments]
        return type(obj)(*[_asdict_inner(v, dict_factory) for v in obj])
    elif isinstance(obj, (list, tuple)):
        # [ditto]
        return type(obj)(_asdict_inner(v, dict_factory) for v in obj)
    elif isinstance(obj, dict):
        return type(obj)((_asdict_inner(k, dict_factory),
                          _asdict_inner(v, dict_factory))
                         for k, v in obj.items())
    else:
        return copy.deepcopy(obj)

asdict simply calls the above with some assertions, and dict_factory=dict by default.
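A quick way to see why dict_factory alone cannot carry a type tag: the factory is handed only a list of (name, value) pairs, never the instance being converted. The sketch below illustrates this; recording_factory is a hypothetical helper written for this example, not part of the dataclasses module.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class Point:
    x: int
    y: int


@dataclass
class C:
    mylist: List[Point]


received = []


def recording_factory(pairs):
    # Hypothetical helper: remember exactly what asdict() passes in.
    pairs = list(pairs)
    received.append(pairs)
    return dict(pairs)


result = asdict(C([Point(0, 0)]), dict_factory=recording_factory)

# Each call saw field names and already-converted values only,
# with no reference to the Point or C instance they came from.
for pairs in received:
    print(pairs)
```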

How can this be adapted to create an output dictionary with the required type-tagging, as mentioned in the comments?

1. Adding type information

My attempt involved creating a custom return wrapper inheriting from dict:

class TypeDict(dict):
    def __init__(self, t, *args, **kwargs):
        super(TypeDict, self).__init__(*args, **kwargs)

        if not isinstance(t, type):
            raise TypeError("t must be a type")

        self._type = t

    @property
    def type(self):
        return self._type

Looking at the original code, only the first clause needs to be modified to use this wrapper, as the other clauses only handle containers of dataclasses:

# only use dict for now; easy to add back later
def _todict_inner(obj):
    if is_dataclass_instance(obj):
        result = []
        for f in fields(obj):
            value = _todict_inner(getattr(obj, f.name))
            result.append((f.name, value))
        return TypeDict(type(obj), result)

    elif isinstance(obj, tuple) and hasattr(obj, '_fields'):
        return type(obj)(*[_todict_inner(v) for v in obj])
    elif isinstance(obj, (list, tuple)):
        return type(obj)(_todict_inner(v) for v in obj)
    elif isinstance(obj, dict):
        return type(obj)((_todict_inner(k), _todict_inner(v))
                         for k, v in obj.items())
    else:
        return copy.deepcopy(obj)

Imports:

from dataclasses import dataclass, fields, is_dataclass

# thanks to Patrick Haugh
from typing import *

# deepcopy 
import copy

Functions used:

# equivalent of the internal function _is_dataclass_instance
def is_dataclass_instance(obj):
    # is_dataclass() is also true for dataclass types, so exclude classes
    return is_dataclass(obj) and not isinstance(obj, type)

# the adapted version of asdict
def todict(obj):
    if not is_dataclass_instance(obj):
        raise TypeError("todict() should be called on dataclass instances")
    return _todict_inner(obj)

Tests with the example dataclasses:

c = C([Point(0, 0), Point(10, 4)])

print(c)
cd = todict(c)

print(cd)
# {'mylist': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]}

print(cd.type)
# <class '__main__.C'>

Results are as expected.

2. Converting back to a dataclass

The recursive routine used by asdict can be re-used for the reverse process, with some relatively minor changes:

def _fromdict_inner(obj):
    # reconstruct the dataclass using the type tag
    if is_dataclass_dict(obj):
        result = {}
        for name, data in obj.items():
            result[name] = _fromdict_inner(data)
        return obj.type(**result)

    # exactly the same as before (without the tuple clause)
    elif isinstance(obj, (list, tuple)):
        return type(obj)(_fromdict_inner(v) for v in obj)
    elif isinstance(obj, dict):
        return type(obj)((_fromdict_inner(k), _fromdict_inner(v))
                         for k, v in obj.items())
    else:
        return copy.deepcopy(obj)

Functions used:

def is_dataclass_dict(obj):
    return isinstance(obj, TypeDict)

def fromdict(obj):
    if not is_dataclass_dict(obj):
        raise TypeError("fromdict() should be called on TypeDict instances")
    return _fromdict_inner(obj)

Test:

c = C([Point(0, 0), Point(10, 4)])
cd = todict(c)
cf = fromdict(cd)

print(c)
# C(mylist=[Point(x=0, y=0), Point(x=10, y=4)])

print(cf)
# C(mylist=[Point(x=0, y=0), Point(x=10, y=4)])

Again as expected.
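One caveat worth making explicit: the type tag lives on the TypeDict object itself, so it survives only while that object stays in memory. The sketch below uses a simplified, flat stand-in for todict (the tag helper is hypothetical and handles no nesting) to show that a JSON round trip yields a plain dict without the tag.

```python
import json
from dataclasses import dataclass, fields


class TypeDict(dict):
    """Same idea as the TypeDict above: a dict that remembers a type."""

    def __init__(self, t, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.type = t


@dataclass
class Point:
    x: int
    y: int


def tag(obj):
    # Simplified stand-in for todict(): flat dataclass instances only.
    return TypeDict(type(obj), ((f.name, getattr(obj, f.name)) for f in fields(obj)))


tagged = tag(Point(1, 2))

# In-process, the tag is enough to rebuild the instance.
rebuilt = tagged.type(**tagged)

# json serialises the TypeDict like an ordinary dict, so after loading
# it back only a plain dict remains and the tag is gone.
restored = json.loads(json.dumps(tagged))
print(type(restored), hasattr(restored, "type"))
```

If the data has to leave the process, the target class therefore has to be supplied again on the way back in, which is what the other answers do.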

answered Oct 06 '22 12:10

meowgoesthedog

All it takes is a five-liner:

import dataclasses

def dataclass_from_dict(klass, d):
    try:
        # fields() raises TypeError when klass is not a dataclass
        fieldtypes = {f.name: f.type for f in dataclasses.fields(klass)}
        return klass(**{f: dataclass_from_dict(fieldtypes[f], d[f]) for f in d})
    except Exception:
        return d  # Not a dataclass field

Sample usage:

from dataclasses import dataclass, asdict

@dataclass
class Point:
    x: float
    y: float

@dataclass
class Line:
    a: Point
    b: Point

line = Line(Point(1,2), Point(3,4))
assert line == dataclass_from_dict(Line, asdict(line))

Full code, including to/from json, here at gist: https://gist.github.com/gatopeich/1efd3e1e4269e1e98fae9983bb914f22
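Note that the function above returns a List[Point] field unchanged, because List[Point] is not itself a dataclass. A minimal sketch of one way to also cover the question's example follows. It assumes Python 3.8+ (for typing.get_origin and typing.get_args) and annotations that are real types rather than strings; from_dict_with_lists is a hypothetical name and is not taken from the gist.

```python
import dataclasses
from dataclasses import dataclass, asdict
from typing import List, get_args, get_origin


def from_dict_with_lists(tp, value):
    """Rebuild `tp` from plain data (hypothetical helper, lists only)."""
    if dataclasses.is_dataclass(tp) and isinstance(value, dict):
        types = {f.name: f.type for f in dataclasses.fields(tp)}
        return tp(**{k: from_dict_with_lists(types[k], v) for k, v in value.items()})
    if get_origin(tp) is list and isinstance(value, list):
        args = get_args(tp)
        if args:  # e.g. List[Point]; a bare List has no item type
            return [from_dict_with_lists(args[0], v) for v in value]
    return value  # anything else is passed through unchanged


@dataclass
class Point:
    x: int
    y: int


@dataclass
class C:
    mylist: List[Point]


c = C([Point(0, 0), Point(10, 4)])
assert from_dict_with_lists(C, asdict(c)) == c
```

Other containers (dict, tuple, Optional) would need their own branches; this sketch deliberately stops at lists.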

answered Oct 06 '22 10:10

gatopeich

Using no additional modules, you can make use of the __post_init__ function to automatically convert the dict values to the correct type. This function is called after __init__.

from dataclasses import dataclass, asdict


@dataclass
class Bar:
    fee: str
    far: str

@dataclass
class Foo:
    bar: Bar

    def __post_init__(self):
        if isinstance(self.bar, dict):
            self.bar = Bar(**self.bar)

foo = Foo(bar=Bar(fee="La", far="So"))

d = asdict(foo)
print(d)  # {'bar': {'fee': 'La', 'far': 'So'}}
o = Foo(**d)
print(o)  # Foo(bar=Bar(fee='La', far='So'))

This solution has the added benefit of being able to use non-dataclass objects. As long as an object's string form (the output of str) can be converted back into the object, it's fair game. For example, it can be used to accept str values while keeping them as ipaddress.IPv4Address objects internally.
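A small sketch of that idea, using the standard library's ipaddress.IPv4Address; the Host class and its field names are made up for illustration:

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass
class Host:
    name: str
    address: IPv4Address

    def __post_init__(self):
        # Accept a plain string and store the richer type internally.
        if isinstance(self.address, str):
            self.address = IPv4Address(self.address)


host = Host(name="gateway", address="192.168.0.1")

# Going back to plain data: str() gives a form that __post_init__ accepts.
d = {"name": host.name, "address": str(host.address)}
assert Host(**d) == host
```

The dict is built by hand here because asdict would keep the IPv4Address object as-is rather than turning it into a string.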

answered Oct 06 '22 10:10

killjoy
