
python yaml.dump bad indentation

I'm executing the following python code:

    import yaml

    foo = {
        'name': 'foo',
        'my_list': [{'foo': 'test', 'bar': 'test2'}, {'foo': 'test3', 'bar': 'test4'}],
        'hello': 'world'
    }

    print(yaml.dump(foo, default_flow_style=False))

but is printing:

    hello: world
    my_list:
    - bar: test2
      foo: test
    - bar: test4
      foo: test3
    name: foo

instead of:

    hello: world
    my_list:
      - bar: test2
        foo: test
      - bar: test4
        foo: test3
    name: foo

How can I indent the my_list elements this way?

fj123x asked Aug 03 '14 20:08




2 Answers

This ticket suggests the current implementation correctly follows the spec:

The “-”, “?” and “:” characters used to denote block collection entries are perceived by people to be part of the indentation. This is handled on a case-by-case basis by the relevant productions.

On the same thread, there is also this code snippet (modified to fit your example) to get the behavior you are looking for:

    import yaml

    class MyDumper(yaml.Dumper):

        # Always pass indentless=False so that sequences nested in
        # mappings get their dashes indented.
        def increase_indent(self, flow=False, indentless=False):
            return super(MyDumper, self).increase_indent(flow, False)

    foo = {
        'name': 'foo',
        'my_list': [
            {'foo': 'test', 'bar': 'test2'},
            {'foo': 'test3', 'bar': 'test4'}],
        'hello': 'world',
    }

    print(yaml.dump(foo, Dumper=MyDumper, default_flow_style=False))
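
As a quick check of the point that the default output already follows the spec, the following minimal sketch (assuming PyYAML is installed) loads both layouts from the question and compares the results. The variable names `flush`, `indented` and `same` are only illustrative.

```python
import yaml

# The indentless layout that PyYAML emits by default.
flush = """\
my_list:
- bar: test2
  foo: test
- bar: test4
  foo: test3
"""

# The layout requested in the question, with the dashes indented.
indented = """\
my_list:
  - bar: test2
    foo: test
  - bar: test4
    foo: test3
"""

# Both documents should parse to the same Python data, so the
# difference between them is purely cosmetic.
same = yaml.safe_load(flush) == yaml.safe_load(indented)
print(same)
```

So the custom dumper changes only how the file looks to a human reader, not what a YAML parser gets out of it.
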
Jace Browning answered Sep 19 '22 15:09


Your output, as shown, is incomplete as print(yaml.dump()) gives you an extra empty line after name: foo. It is also slower and uses more memory than directly streaming to sys.stdout.
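
A minimal sketch of the difference, assuming PyYAML under Python 3: passing a stream as the second argument to `yaml.dump()` writes to it directly, while calling it without a stream returns a string that already ends in a newline, which `print()` then follows with a second one. The name `text` is only illustrative.

```python
import sys
import yaml

foo = {
    'name': 'foo',
    'my_list': [{'foo': 'test', 'bar': 'test2'}, {'foo': 'test3', 'bar': 'test4'}],
    'hello': 'world'
}

# Stream straight to stdout: no intermediate string is returned
# and no extra blank line is printed.
yaml.dump(foo, sys.stdout, default_flow_style=False)

# Without a stream, dump() returns the document as a string that
# already ends in a newline; print(text) would add another one.
text = yaml.dump(foo, default_flow_style=False)
```
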

You are probably using PyYAML and, apart from only supporting the outdated YAML 1.1 specification, it is very limited in control over the dumped YAML.

I suggest you use ruamel.yaml (disclaimer: I am the author of that package), where you can specify indentation separately for mappings and sequences and also indicate how far to offset the dash within the indent before the sequence element:

    import sys
    import ruamel.yaml

    foo = {
        'name': 'foo',
        'my_list': [{'foo': 'test', 'bar': 'test2'}, {'foo': 'test3', 'bar': 'test4'}],
        'hello': 'world'
    }

    yaml = ruamel.yaml.YAML()
    yaml.indent(sequence=4, offset=2)
    yaml.dump(foo, sys.stdout)

which gives:

    name: foo
    my_list:
      - foo: test
        bar: test2
      - foo: test3
        bar: test4
    hello: world

Please note that the order of the keys is implementation dependent (but can be controlled, as ruamel.yaml can round-trip the above without changes).
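
For comparison, if you stay with PyYAML, key order can also be influenced there. This is a sketch under the assumption of PyYAML 5.1 or later (where `yaml.dump()` accepts `sort_keys`) and Python 3.7 or later (where plain dicts keep insertion order); it is not part of ruamel.yaml, and the name `text` is only illustrative.

```python
import yaml

foo = {
    'name': 'foo',
    'my_list': [{'foo': 'test', 'bar': 'test2'}, {'foo': 'test3', 'bar': 'test4'}],
    'hello': 'world'
}

# By default PyYAML sorts mapping keys alphabetically; sort_keys=False
# keeps the insertion order of the dict instead.
text = yaml.dump(foo, default_flow_style=False, sort_keys=False)
print(text, end='')
```

Note that this only affects key order; the dashes of `my_list` are still not indented unless a custom dumper is used as well.
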

Anthon answered Sep 23 '22 15:09