I am trying to output and then to parse back from YAML the following
import numpy as np

class MyClass(object):
    YAMLTag = '!MyClass'

    def __init__(self, name, times, zeros):
        self.name = name
        self._T = np.array(times)
        self._zeros = np.array(zeros)
The YAML file looks like
!MyClass:
name: InstanceId
times: [0.0, 0.25, 0.5, 1.0, 2.0, 5.0, 10.0]
zeros: [0.03, 0.03, 0.04, 0.03, 0.03, 0.02, 0.03]
To write, I have added to the class two methods
    def toDict(self):
        return {'name' : self.name,
                'times' : [float(t) for t in self._T],
                'zeros' : [float(t) for t in self._zeros]}

    @staticmethod
    def ToYAML(dumper, data):
        return dumper.represent_dict({data.YAMLTag : data.toDict()})
and to read, the method
    @staticmethod
    def FromYAML(loader, node):
        nodeMap = loader.construct_mapping(node)
        return MyClass(name = nodeMap['name'],
                       times = nodeMap['times'],
                       zeros = nodeMap['zeros'])
and following YAML Documentation, I added the following snippet in the same Python file myClass.py:
import yaml
yaml.add_constructor(MyClass.YAMLTag, MyClass.FromYAML)
yaml.add_representer(MyClass, MyClass.ToYAML)
Now, the writing seems to work ok, but when reading the YAML, the code loader.construct_mapping(node) seems to return the dictionary with empty data:
{'zeros': [], 'name': 'InstanceId', 'times': []}
How should I fix the reader to be able to do this properly? Or perhaps I am not writing something out right? I spent a long time looking at PyYAML documentation and debugging through how the package is implemented but cannot figure out a way to parse out a complicated structure, and the only example I seemed to find has a 1-line class which parses out easily.
Related: YAML parsing and Python
UPDATE
Manually parsing the node as follows worked:
name, times, zeros = None, None, None
for key, value in node.value:
    elementName = loader.construct_scalar(key)
    if elementName == 'name':
        name = loader.construct_scalar(value)
    elif elementName == 'times':
        times = loader.construct_sequence(value)
    elif elementName == 'zeros':
        zeros = loader.construct_sequence(value)
    else:
        raise ValueError('Unexpected YAML key %s' % elementName)
But the question still stands, is there a non-manual way to do this?
We can read a YAML file using the PyYAML module's yaml.load() function. This function parses a YAML document and converts it to Python objects (a mapping becomes a dict). This process is known as deserializing YAML into Python.
YAML is a data serialization format designed for human readability and interaction with scripting languages. PyYAML is a YAML parser and emitter for the Python programming language.
Python itself has no built-in support for the YAML data format, even though YAML is commonly used for configuration and serialization.
Dumping YAML: yaml.dump() accepts an optional second argument, which must be an open text or binary file. In that case, yaml.dump() writes the produced YAML document into the file. Otherwise, yaml.dump() returns the produced document.
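As a minimal sketch of the two dumping modes described above (the settings dict is made up for illustration, and safe_dump/safe_load are used in place of dump/load):

```python
import io
import yaml

# Hypothetical data, chosen only for illustration.
settings = {'name': 'InstanceId', 'times': [0.0, 0.25]}

# Without a stream argument, the YAML document is returned as a string.
text = yaml.safe_dump(settings)

# With a stream argument, the document is written to the stream
# and the call returns None.
buffer = io.StringIO()
returned = yaml.safe_dump(settings, buffer)

# Reading the document back gives an equal dict.
loaded = yaml.safe_load(buffer.getvalue())
print(text)
print(loaded)
```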
There are multiple problems with your approach, even leaving aside that you should read PEP 8, the style guide for Python code, in particular the parts on Method Names and Instance Variables.
As you indicate that you have looked long at the PyYAML documentation, you cannot have failed to notice that yaml.load() is unsafe. It is also almost never necessary to use it, certainly not if you write your own representers and constructors.
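A small sketch of what "unsafe" means in practice, assuming a reasonably recent PyYAML: yaml.safe_load() only builds plain Python data and refuses tags that would construct arbitrary objects or call functions. The untrusted string below is a made-up example.

```python
import yaml

# A document that asks the loader to call a Python function.
# The tag is a standard PyYAML python/* tag; the input itself is invented.
untrusted = "!!python/object/apply:os.getcwd []"

try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.YAMLError:
    # SafeLoader has no constructor for python/* tags, so loading fails.
    rejected = True

# Plain data loads without problems.
plain = yaml.safe_load("name: InstanceId\ntimes: [0.0, 0.25]")
print(rejected, plain)
```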
You use dumper.represent_dict({data.YAMLTag : data.toDict()}), which dumps an object as a key-value pair. What you want to do, at least if you want to have a tag in your output YAML, is: dumper.represent_mapping(data.YAMLTag, data.toDict()). This will get you output of the form:
!MyClass
name: InstanceId
times: [0.0, 0.25, 0.5, 1.0, 2.0, 5.0, 10.0]
zeros: [0.03, 0.03, 0.04, 0.03, 0.03, 0.02, 0.03]
i.e. a tagged mapping instead of your key-value pair, where the value is a mapping. (And I would have expected the first line to be '!MyClass': to make sure the scalar that starts with an exclamation mark is not interpreted as a tag).
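To make the difference concrete, here is a rough sketch that dumps the same object both ways. Point, PairDumper and TagDumper are hypothetical names introduced only for this comparison; they are not part of the question's code.

```python
import yaml

class Point(object):
    """Hypothetical stand-in for MyClass."""
    yaml_tag = u'!Point'

    def __init__(self, x, y):
        self.x = x
        self.y = y

def as_key_value_pair(dumper, data):
    # The tag string becomes an ordinary mapping key.
    return dumper.represent_dict({data.yaml_tag: {'x': data.x, 'y': data.y}})

def as_tagged_mapping(dumper, data):
    # The tag is attached to the mapping node itself.
    return dumper.represent_mapping(data.yaml_tag, {'x': data.x, 'y': data.y})

# Separate dumper subclasses keep the two representers apart.
class PairDumper(yaml.SafeDumper):
    pass

class TagDumper(yaml.SafeDumper):
    pass

PairDumper.add_representer(Point, as_key_value_pair)
TagDumper.add_representer(Point, as_tagged_mapping)

pair_text = yaml.dump(Point(1, 2), Dumper=PairDumper)
tag_text = yaml.dump(Point(1, 2), Dumper=TagDumper)
print(pair_text)
print(tag_text)
```

The first document loads back as a plain dict with a string key, so no custom constructor is ever invoked for it; only the second carries a real tag.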
Constructing complex objects that are potentially self-referential (directly or indirectly) has to be done in two steps using a generator (the PyYAML code calls this in the correct way for you). In your code you assume that you have all the parameters to create an instance of MyClass. But if there is self-reference, these parameters have to include that instance itself, and it is not created yet. The proper example code in the PyYAML code base for this is construct_yaml_object() in constructor.py:
def construct_yaml_object(self, node, cls):
    data = cls.__new__(cls)
    yield data
    if hasattr(data, '__setstate__'):
        state = self.construct_mapping(node, deep=True)
        data.__setstate__(state)
    else:
        state = self.construct_mapping(node)
        data.__dict__.update(state)
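You don't have to use .__new__(), but you should take deep=True into account: without it, construct_mapping() hands back nested sequences and mappings that are only filled in after your constructor has returned, which is why times and zeros are empty in your nodeMap.
In general it also is useful to have a __repr__() that allows you to check the object that you load, with something more expressive than <__main__.MyClass object at 0x12345>.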
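The effect of deep=True can be observed directly. The following is a minimal sketch, assuming PyYAML's standard SafeLoader; the !Demo tag, the loader subclasses and the seen dict are made up for this demonstration.

```python
import yaml

TEXT = """\
!Demo
name: InstanceId
times: [0.0, 0.25, 0.5]
"""

# Records what each constructor saw while it was running.
seen = {}

def shallow_constructor(loader, node):
    mapping = loader.construct_mapping(node)
    # Copy the list now: nested nodes have not been filled in yet.
    seen['shallow'] = list(mapping['times'])
    return mapping

def deep_constructor(loader, node):
    mapping = loader.construct_mapping(node, deep=True)
    # With deep=True the nested list is already complete.
    seen['deep'] = list(mapping['times'])
    return mapping

class ShallowLoader(yaml.SafeLoader):
    pass

class DeepLoader(yaml.SafeLoader):
    pass

ShallowLoader.add_constructor(u'!Demo', shallow_constructor)
DeepLoader.add_constructor(u'!Demo', deep_constructor)

shallow_result = yaml.load(TEXT, Loader=ShallowLoader)
deep_result = yaml.load(TEXT, Loader=DeepLoader)

print(seen['shallow'])          # snapshot taken inside the shallow constructor
print(seen['deep'])             # snapshot taken inside the deep constructor
print(shallow_result['times'])  # the same list, inspected after loading finished
```

The shallow constructor sees an empty list even though the returned mapping is complete once loading finishes. Copying the data inside the constructor, as np.array(times) does, therefore captures the empty state.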
The imports:
from __future__ import print_function
import sys
import yaml
from io import StringIO  # Python 3; on Python 2 use: from cStringIO import StringIO
import numpy as np
To check the correct workings of self-referential versions I added the self._ref attribute to the class:
class MyClass(object):
    YAMLTag = u'!MyClass'

    def __init__(self, name=None, times=[], zeros=[], ref=None):
        self.update(name, times, zeros, ref)

    def update(self, name, times, zeros, ref):
        self.name = name
        self._T = np.array(times)
        self._zeros = np.array(zeros)
        self._ref = ref

    def toDict(self):
        return dict(name=self.name,
                    times=self._T.tolist(),
                    zeros=self._zeros.tolist(),
                    ref=self._ref,
                    )

    def __repr__(self):
        return "{}(name={}, times={}, zeros={})".format(
            self.__class__.__name__,
            self.name,
            self._T.tolist(),
            self._zeros.tolist(),
        )

    def update_self_ref(self, ref):
        self._ref = ref
The representer and constructor "methods":
    @staticmethod
    def to_yaml(dumper, data):
        return dumper.represent_mapping(data.YAMLTag, data.toDict())

    @staticmethod
    def from_yaml(loader, node):
        value = MyClass()
        yield value
        node_map = loader.construct_mapping(node, deep=True)
        value.update(**node_map)


yaml.add_representer(MyClass, MyClass.to_yaml, Dumper=yaml.SafeDumper)
yaml.add_constructor(MyClass.YAMLTag, MyClass.from_yaml, Loader=yaml.SafeLoader)
And how to use it:
instance = MyClass('InstanceId',
                   [0.0, 0.25, 0.5, 1.0, 2.0, 5.0, 10.0],
                   [0.03, 0.03, 0.04, 0.03, 0.03, 0.02, 0.03])
instance.update_self_ref(instance)

buf = StringIO()
yaml.safe_dump(instance, buf)

yaml_str = buf.getvalue()
print(yaml_str)

data = yaml.safe_load(yaml_str)
print(data)
print(id(data), id(data._ref))
The above combined gives:
&id001 !MyClass
name: InstanceId
ref: *id001
times: [0.0, 0.25, 0.5, 1.0, 2.0, 5.0, 10.0]
zeros: [0.03, 0.03, 0.04, 0.03, 0.03, 0.02, 0.03]
MyClass(name=InstanceId, times=[0.0, 0.25, 0.5, 1.0, 2.0, 5.0, 10.0], zeros=[0.03, 0.03, 0.04, 0.03, 0.03, 0.02, 0.03])
139737236881744 139737236881744
As you can see the ids of data and data._ref are the same after loading.
The above throws an error if you use the simplistic approach in your constructor, by just using loader.construct_mapping(node, deep=True).
Instead of
nodeMap = loader.construct_mapping(node)
try this:
nodeMap = loader.construct_mapping(node, deep=True)
Also, you have a little mistake in your YAML file:
!MyClass:
The colon at the end does not belong there.
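Putting both suggestions together, a sketch of the question's return-style constructor might look like the following. Curve and CurveLoader are hypothetical stand-ins for MyClass and its loader, and tuple() takes the place of np.array() so that the example needs only PyYAML.

```python
import yaml

class Curve(object):
    """Hypothetical stand-in for MyClass, without NumPy."""
    yaml_tag = u'!Curve'

    def __init__(self, name, times, zeros):
        self.name = name
        # tuple() copies the data immediately, much like np.array() would.
        self.times = tuple(times)
        self.zeros = tuple(zeros)

def curve_constructor(loader, node):
    # deep=True ensures the nested lists are complete before they are copied.
    node_map = loader.construct_mapping(node, deep=True)
    return Curve(**node_map)

class CurveLoader(yaml.SafeLoader):
    pass

CurveLoader.add_constructor(Curve.yaml_tag, curve_constructor)

# Note: no colon after the tag on the first line.
DOCUMENT = """\
!Curve
name: InstanceId
times: [0.0, 0.25, 0.5]
zeros: [0.03, 0.03, 0.04]
"""

curve = yaml.load(DOCUMENT, Loader=CurveLoader)
print(curve.name, curve.times, curve.zeros)
```

This style is enough for documents without self-references; for self-referential data the generator-based constructor is still needed.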