
When ruamel.yaml loads @dataclass from string, __post_init__ is not called

Assume I created a @dataclass class Foo, and added a __post_init__ to perform type checking and processing.

When I attempt to yaml.load a !Foo object, __post_init__ is not called.

from dataclasses import dataclass, fields

from ruamel.yaml import yaml_object, YAML


yaml = YAML()


@yaml_object(yaml)
@dataclass
class Foo:
    foo: int
    bar: int

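    def __post_init__(self):
        # Unconditional raise: shows that this method is never reached by yaml.load
        raise Exception
        # Intended validation (unreachable while the raise above is in place)
        for field in fields(self):
            value = getattr(self, field.name)
            typ = field.type
            if not isinstance(value, typ):
                raise Exception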

s = '''\
!Foo
foo: "foo"
bar: "bar"
'''
yaml.load(s)

How do I perform parameter checking when loading dataclasses via ruamel.yaml?

This behavior occurs in Python 3.7 as well as 3.6 with pip install dataclasses.

nyanpasu64 asked Jul 26 '18 00:07




2 Answers

I'm not entirely sure if this is the correct workaround...

I can move logic from __post_init__ to __setstate__(state: dict), which gets called by YAML().load().

def __setstate__(self, state):
    self.__dict__.update(state)
    # I could call self.__post_init__(), or alternatively move logic here:
    for field in fields(self):
        value = getattr(self, field.name)
        typ = field.type
        if not isinstance(value, typ):
            raise Exception

YAML().load(s) calls Foo.__setstate__(state) if that method exists, but apparently not __init__ (which calls __post_init__). Is this an intentional design decision?
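A minimal stdlib-only sketch of that idea, assuming the loader builds the object roughly as `cls.__new__(cls)` followed by `__setstate__(state)`, which is what the behavior described above suggests. The helper `build_like_loader` is hypothetical and only stands in for the loader; it is not part of ruamel.yaml. Delegating from `__setstate__` to `__post_init__` keeps the validation in one place:

```python
from dataclasses import dataclass, fields


@dataclass
class Foo:
    foo: int
    bar: int

    def __post_init__(self):
        # Validate each field against its annotated type.
        for field in fields(self):
            value = getattr(self, field.name)
            if not isinstance(value, field.type):
                raise TypeError(
                    f"{field.name} must be {field.type.__name__}, got {value!r}"
                )

    def __setstate__(self, state):
        # Restore the attributes, then reuse the validation that __init__ runs.
        self.__dict__.update(state)
        self.__post_init__()


def build_like_loader(cls, state):
    # Hypothetical stand-in for the loader: __init__ is never called here.
    obj = cls.__new__(cls)
    obj.__setstate__(state)
    return obj
```

With this in place, `build_like_loader(Foo, {"foo": "foo", "bar": "bar"})` raises `TypeError`, the same as `Foo("foo", "bar")` would. Note that the `isinstance` check only works while the annotations are real classes; string annotations would need to be resolved first.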

nyanpasu64 answered Oct 03 '22 23:10



The reason __post_init__ is not called is that ruamel.yaml (and the PyYAML code in its Constructors) was written long before dataclasses existed.

Of course, code for calling __post_init__() could be added to ruamel.yaml's Python object constructors, preferably after a test that the object was created using @dataclass. Otherwise a non-dataclass class that happens to have a method named __post_init__ would all of a sudden have that method called during loading.
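Such a test could look like the following sketch, which uses only the standard library's `dataclasses.is_dataclass`. The helper name `call_post_init_if_dataclass` and the two example classes are hypothetical, made up for illustration; nothing here is ruamel.yaml API:

```python
from dataclasses import dataclass, is_dataclass


def call_post_init_if_dataclass(obj):
    # is_dataclass() is also true for dataclass *classes*, so exclude types.
    if is_dataclass(obj) and not isinstance(obj, type):
        post_init = getattr(obj, "__post_init__", None)
        if post_init is not None:
            post_init()
            return True
    return False


@dataclass
class Point:
    x: int
    calls: int = 0

    def __post_init__(self):
        self.calls += 1


class Legacy:
    # Not a dataclass, but happens to define a method with the same name.
    def __init__(self):
        self.calls = 0

    def __post_init__(self):
        self.calls += 1
```

Calling the helper on a `Point` instance runs `__post_init__` once more, while a `Legacy` instance is left untouched, which is the distinction the paragraph above asks for.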

If you have no such classes, you can add your own, smarter, constructor to the YAML() instance before first loading/dumping (at which moment the constructor is instantiated) using yaml.Constructor = MyConstructor. But adding a constructor is not as trivial as subclassing the RoundTripConstructor, because all supported node types need to be registered on such a new constructor type.

Most of the time I find it easier to just patch the appropriate method on the RoundTripConstructor:

from dataclasses import dataclass, fields
from ruamel.yaml import yaml_object, YAML, RoundTripConstructor


def my_construct_yaml_object(self, node, cls):
    # Run the original generator first, so the object is fully populated
    for data in self.org_construct_yaml_object(node, cls):
        yield data
    # Use getattr instead of try/except AttributeError, so that an
    # AttributeError raised inside `__post_init__` itself is not swallowed
    post_init = getattr(data, '__post_init__', None)
    if post_init:
        post_init()

RoundTripConstructor.org_construct_yaml_object = RoundTripConstructor.construct_yaml_object
RoundTripConstructor.construct_yaml_object = my_construct_yaml_object

yaml = YAML()
yaml.preserve_quotes = True

@yaml_object(yaml)
@dataclass
class Foo:
    foo: int
    bar: int

    def __post_init__(self):
        for field in fields(self):
            value = getattr(self, field.name)
            typ = field.type
            if not isinstance(value, typ):
                raise Exception

s = '''\
!Foo
foo: "foo"
bar: "bar"
'''
d = yaml.load(s)

throws an exception:

Traceback (most recent call last):
  File "try.py", line 36, in <module>
    d = yaml.load(s)
  File "/home/venv/tmp-46489abf428c4cd4/lib/python3.7/site-packages/ruamel/yaml/main.py", line 266, in load
    return constructor.get_single_data()
  File "/home/venv/tmp-46489abf428c4cd4/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 105, in get_single_data
    return self.construct_document(node)
  File "/home/venv/tmp-46489abf428c4cd4/lib/python3.7/site-packages/ruamel/yaml/constructor.py", line 115, in construct_document
    for dummy in generator:
  File "try.py", line 10, in my_construct_yaml_object
    post_init()
  File "try.py", line 29, in __post_init__
    raise Exception
Exception

Please note that the double quotes in your YAML are superfluous, so if you want to preserve them on round-trip you need to set yaml.preserve_quotes = True.

Anthon answered Oct 03 '22 23:10
