Controlling Yaml Serialization Order in Python

Tags:

python

yaml

How do you control the order in which PyYAML outputs key/value pairs when serializing a Python dictionary?

I'm using Yaml as a simple serialization format in a Python script. My Yaml serialized objects represent a sort of "document", so for maximum user-friendliness, I'd like my object's "name" field to appear first in the file. Of course, since the value returned by my object's __getstate__ is a dictionary, and Python dictionaries are unordered, the "name" field will be serialized to a random location in the output.

e.g.

>>> import yaml
>>> class Document(object):
...     def __init__(self, name):
...         self.name = name
...         self.otherstuff = 'blah'
...     def __getstate__(self):
...         return self.__dict__.copy()
... 
>>> doc = Document('obj-20111227')
>>> print(yaml.dump(doc, indent=4))
!!python/object:__main__.Document
otherstuff: blah
name: obj-20111227
Cerin asked Dec 28 '11 01:12


2 Answers

Took me a few hours of digging through PyYAML docs and tickets, but I eventually discovered this comment that lays out some proof-of-concept code for serializing an OrderedDict as a normal YAML map (but maintaining the order).

e.g. applied to my original code, the solution looks something like:

>>> import yaml
>>> from collections import OrderedDict
>>> def dump_anydict_as_map(anydict):
...     yaml.add_representer(anydict, _represent_dictorder)
... 
>>> def _represent_dictorder(self, data):
...     if isinstance(data, Document):
...         return self.represent_mapping('tag:yaml.org,2002:map', data.__getstate__().items())
...     else:
...         return self.represent_mapping('tag:yaml.org,2002:map', data.items())
... 
>>> class Document(object):
...     def __init__(self, name):
...         self.name = name
...         self.otherstuff = 'blah'
...     def __getstate__(self):
...         d = OrderedDict()
...         d['name'] = self.name
...         d['otherstuff'] = self.otherstuff
...         return d
... 
>>> dump_anydict_as_map(Document)
>>> doc = Document('obj-20111227')
>>> print(yaml.dump(doc, indent=4))
!!python/object:__main__.Document
name: obj-20111227
otherstuff: blah
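
For anyone on Python 3, here is a minimal sketch of the same representer idea as a plain script. The function name `represent_document` and the extra `author` attribute are made up for illustration (the attribute is there only so that the desired order differs from alphabetical order). It relies on `represent_mapping` sorting only objects that have an `.items()` method, so a list of pairs is passed through in the given order. Because the representer uses the plain map tag, the dumped text in this sketch is an ordinary mapping.

```python
import yaml
from collections import OrderedDict


class Document(object):
    def __init__(self, name):
        self.name = name
        self.otherstuff = 'blah'
        self.author = 'cerin'  # hypothetical extra field, for illustration only

    def __getstate__(self):
        # Build the state in the order we want it to appear in the YAML.
        d = OrderedDict()
        d['name'] = self.name
        d['otherstuff'] = self.otherstuff
        d['author'] = self.author
        return d


def represent_document(dumper, data):
    # Pass a list of (key, value) pairs rather than a dict, so PyYAML
    # emits the pairs in the order given instead of sorting the keys.
    pairs = list(data.__getstate__().items())
    return dumper.represent_mapping('tag:yaml.org,2002:map', pairs)


# Register the representer for Document on the default Dumper.
yaml.add_representer(Document, represent_document)

text = yaml.dump(Document('obj-20111227'), indent=4)
print(text)
```

The keys come out in the order that `__getstate__` built them, with "name" first, even though "author" would sort ahead of it alphabetically.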
Cerin answered Sep 17 '22 17:09


I think the problem is in how you dump the data. I looked into the PyYAML code, and the dump functions take an optional argument called sort_keys; setting it to False seems to do the trick.
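
A minimal sketch of what that looks like, assuming PyYAML 5.1 or later (where, as far as I know, `sort_keys` was introduced) and Python 3.7+ (where plain dicts keep insertion order). The `author` key is made up for illustration, so that insertion order and alphabetical order differ.

```python
import yaml

# Insertion order: name, author, otherstuff.
# 'author' is a hypothetical key added only for this example.
doc = {
    'name': 'obj-20111227',
    'author': 'cerin',
    'otherstuff': 'blah',
}

# Default behaviour: keys are sorted alphabetically.
sorted_text = yaml.safe_dump(doc)

# With sort_keys=False the dict's own insertion order is kept.
ordered_text = yaml.safe_dump(doc, sort_keys=False)

print(sorted_text)
print(ordered_text)
```

The first dump should start with "author", the second with "name". The same keyword is accepted by `yaml.dump`.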

Sutsuj OwO answered Sep 20 '22 17:09