I have a list of dictionaries, which I want to serialize:
    list_of_dicts = [
        {'key_1': 'value_a', 'key_2': 'value_b'},
        {'key_1': 'value_c', 'key_2': 'value_d'},
        ...
        {'key_1': 'value_x', 'key_2': 'value_y'}
    ]

    yaml.dump(list_of_dicts, file, default_flow_style=False)
produces the following:
    - key_1: value_a
      key_2: value_b
    - key_1: value_c
      key_2: value_d
    (...)
    - key_1: value_x
      key_2: value_y
But I'd like to get this:
    - key_1: value_a
      key_2: value_b
                         <-|
    - key_1: value_c       |
      key_2: value_d       |  empty lines between blocks
    (...)                  |
                         <-|
    - key_1: value_x
      key_2: value_y
The PyYAML documentation talks about the dump() arguments very briefly and doesn't seem to have anything on this particular subject.
Editing the file manually to add blank lines improves readability quite a lot, and the structure still loads just fine afterwards, but I have no idea how to make the dump method generate them.
And in general, is there a way to have more control over output formatting besides simple indentation?
There's no easy way to do this with the library (Node objects in yaml dumper syntax tree are passive and can't emit this info), so I ended up with
    stream = yaml.dump(list_of_dicts, default_flow_style=False)
    file.write(stream.replace('\n- ', '\n\n- '))
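The same post-processing idea can be wrapped in a small helper. This is only a sketch: the function name `separate_top_level_items` and the sample string are made up for illustration, and it assumes the dumped document is a top-level block sequence, so that only its items start at column 0 with `- `. It works on the string that yaml.dump returns when no stream is passed, so it needs nothing beyond the standard library.

```python
import re


def separate_top_level_items(dumped: str) -> str:
    """Insert one blank line before every top-level sequence item except the first."""
    # A top-level item starts at column 0 with "- ". The lookahead leaves the
    # dash untouched, so only an extra newline is inserted in front of it.
    return re.sub(r"\n(?=- )", "\n\n", dumped)


# Hypothetical sample standing in for the string returned by
# yaml.dump(list_of_dicts, default_flow_style=False).
dumped = (
    "- key_1: value_a\n"
    "  key_2: value_b\n"
    "- key_1: value_c\n"
    "  key_2: value_d\n"
)

print(separate_top_level_items(dumped))
```

If the top-level node were a mapping instead, PyYAML would most likely emit nested sequences at column 0 as well, and this pattern would then also separate those items, so the helper should probably only be used for a top-level list like the one in the question. It is also meant to be applied once; running it on its own output adds further blank lines.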
The PyYAML documentation only talks about the dump() arguments briefly, because there is not much to say. This kind of control is not provided by PyYAML.
To allow preservation of such empty (and comment) lines in YAML that is loaded, I started the development of the ruamel.yaml library, a superset of the stalled PyYAML, with YAML 1.2 compatibility, many features added and bugs fixed. With ruamel.yaml you can do:
    import sys
    import ruamel.yaml

    yaml_str = """\
    - key_1: value_a
      key_2: value_b

    - key_1: value_c
      key_2: value_d

    - key_1: value_x  # a few before this were ellipsed
      key_2: value_y
    """

    yaml = ruamel.yaml.YAML()
    data = yaml.load(yaml_str)
    yaml.dump(data, sys.stdout)
and get the output exactly the same as the input string (including the comment).
You can also build the output that you want from scratch:
    import sys
    import ruamel.yaml

    yaml = ruamel.yaml.YAML()
    list_of_dicts = yaml.seq([
        {'key_1': 'value_a', 'key_2': 'value_b'},
        {'key_1': 'value_c', 'key_2': 'value_d'},
        {'key_1': 'value_x', 'key_2': 'value_y'}
    ])

    for idx in range(1, len(list_of_dicts)):
        list_of_dicts.yaml_set_comment_before_after_key(idx, before='\n')

    ruamel.yaml.comments.dump_comments(list_of_dicts)
    yaml.dump(list_of_dicts, sys.stdout)
The conversion using yaml.seq() is necessary to create an object that allows attachment of the empty lines through special attributes.
The library also allows preservation and easy setting of quotes and literal style on strings, and of the format of ints (hex, octal, binary) and floats. It also supports separate indent specifications for mappings and sequences (although not for individual mappings or sequences).