
Is it possible to preserve YAML block structure when dumping a parsed document?

We use PyYAML to prepare config files for different environments, but our literal block scalars lose their formatting when a parsed document is dumped again.

Given input.yml ...

pubkey: |
    -----BEGIN PUBLIC KEY-----
    MIGfMA0GCSq7OPxRrQEBAQUAA4GNADCBiQKBgQCvRVUKp6pr4qBEnE9lviuyfiNq
    QtG/OCyBDXL4Bh3FmUzfNI+Z4Bh3FmUx+z2n0FCv/4BpgHTDl8D95NPopWVo1RH2
    UfhyMd6dQ/x9T5m+y38JMzmSVAk+Fqu8ya18+yQVOEyEIx3Gxpsgegow33gcxfjK
    EsUgJHXcpw7OPxRrCQIDAQAB
    -----END PUBLIC KEY-----

... executing this program using python3 ...

import yaml

with open('input.yml', mode='r') as f:
    parsed = yaml.safe_load(f)

with open('output.yml', mode='w') as f:
    yaml.dump(parsed, f)

... produces this output.yml ...

pubkey: '-----BEGIN PUBLIC KEY-----

    MIGfMA0GCSq7OPxRrQEBAQUAA4GNADCBiQKBgQCvRVUKp6pr4qBEnE9lviuyfiNq

    QtG/OCyBDXL4Bh3FmUzfNI+Z4Bh3FmUx+z2n0FCv/4BpgHTDl8D95NPopWVo1RH2

    UfhyMd6dQ/x9T5m+y38JMzmSVAk+Fqu8ya18+yQVOEyEIx3Gxpsgegow33gcxfjK

    EsUgJHXcpw7OPxRrCQIDAQAB

    -----END PUBLIC KEY-----

    '

Is it possible to preserve the structure of my block using PyYAML?

asked Jan 05 '16 at 23:01 by Chris Betti




1 Answer

Yes, that is possible with PyYAML, but you have to provide your own enhanced versions of at least the Scanner, Parser and Constructor used by safe_load and of the Emitter, Serializer and Representer used by dump, plus a specialized string-like class that keeps information about its original formatting.
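If the goal is only to have multi-line strings written as literal blocks, rather than to record each scalar's original style, a custom representer is a lighter-weight workaround. The sketch below is not the full set of changes described above: `LiteralStr` and `mark_multiline` are hypothetical names introduced for illustration, and it assumes that every string containing a newline should be dumped with `|`.

```python
import yaml


class LiteralStr(str):
    """Hypothetical marker type: strings to be dumped in literal block style."""


def represent_literal_str(dumper, data):
    # Emit the value as a normal YAML string, but request the '|' style.
    return dumper.represent_scalar('tag:yaml.org,2002:str', str(data), style='|')


# Registers the representer on the default Dumper used by yaml.dump.
yaml.add_representer(LiteralStr, represent_literal_str)


def mark_multiline(node):
    """Recursively wrap every multi-line string in LiteralStr."""
    if isinstance(node, dict):
        return {key: mark_multiline(value) for key, value in node.items()}
    if isinstance(node, list):
        return [mark_multiline(value) for value in node]
    if isinstance(node, str) and '\n' in node:
        return LiteralStr(node)
    return node


source = "pubkey: |\n    line one\n    line two\n"
parsed = yaml.safe_load(source)

# indent=4 matches the indentation used in the input document.
dumped = yaml.dump(mark_multiline(parsed), indent=4)
print(dumped)
```

This forces a style at dump time instead of remembering the one that was loaded, so a multi-line string that was originally quoted or folded would also come out as a literal block. Recording the original style of each scalar still requires the full set of changes described above.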

This is part of what was added to ruamel.yaml (disclaimer: I am the author of that package) when it was derived from PyYAML. Using ruamel.yaml, the preferred way of doing this is:

import sys
import ruamel.yaml

yaml_str = """\
pubkey: |
    -----BEGIN PUBLIC KEY-----
    MIGfMA0GCSq7OPxRrQEBAQUAA4GNADCBiQKBgQCvRVUKp6pr4qBEnE9lviuyfiNq
    QtG/OCyBDXL4Bh3FmUzfNI+Z4Bh3FmUx+z2n0FCv/4BpgHTDl8D95NPopWVo1RH2
    UfhyMd6dQ/x9T5m+y38JMzmSVAk+Fqu8ya18+yQVOEyEIx3Gxpsgegow33gcxfjK
    EsUgJHXcpw7OPxRrCQIDAQAB
    -----END PUBLIC KEY-----
"""
yaml = ruamel.yaml.YAML()  # defaults to round-trip
yaml.indent(mapping=4)
data = yaml.load(yaml_str)
yaml.dump(data, sys.stdout)

Or the older, more PyYAML-like style (which restricts the options you can set, and whose module-level functions are deprecated in later ruamel.yaml releases):

import sys
import ruamel.yaml as yaml

yaml_str = """\
pubkey: |
    -----BEGIN PUBLIC KEY-----
    MIGfMA0GCSq7OPxRrQEBAQUAA4GNADCBiQKBgQCvRVUKp6pr4qBEnE9lviuyfiNq
    QtG/OCyBDXL4Bh3FmUzfNI+Z4Bh3FmUx+z2n0FCv/4BpgHTDl8D95NPopWVo1RH2
    UfhyMd6dQ/x9T5m+y38JMzmSVAk+Fqu8ya18+yQVOEyEIx3Gxpsgegow33gcxfjK
    EsUgJHXcpw7OPxRrCQIDAQAB
    -----END PUBLIC KEY-----
"""

data = yaml.load(yaml_str, Loader=yaml.RoundTripLoader)
yaml.dump(data, sys.stdout, Dumper=yaml.RoundTripDumper, indent=4)

Both of which give you:

pubkey: |
    -----BEGIN PUBLIC KEY-----
    MIGfMA0GCSq7OPxRrQEBAQUAA4GNADCBiQKBgQCvRVUKp6pr4qBEnE9lviuyfiNq
    QtG/OCyBDXL4Bh3FmUzfNI+Z4Bh3FmUx+z2n0FCv/4BpgHTDl8D95NPopWVo1RH2
    UfhyMd6dQ/x9T5m+y38JMzmSVAk+Fqu8ya18+yQVOEyEIx3Gxpsgegow33gcxfjK
    EsUgJHXcpw7OPxRrCQIDAQAB
    -----END PUBLIC KEY-----

at least with Python 2.7 and 3.5+.

The indent=4 is necessary because the RoundTripDumper defaults to a two-space indent and the original indent of a file is not preserved (not preserving it makes it easier to re-indent a YAML file).

If you cannot switch to ruamel.yaml, you should be able to use its source to extract all the changes needed. If you can switch, you also get its other features, such as comment preservation and merge key name preservation.

answered Sep 19 '22 at 16:09 by Anthon