
Add multiple docs in yaml file | PyYAML

I am working on a project where Python first reads a YAML file, makes some changes, and then writes the result back to the file. Loading and updating the values works fine, but when I write the file, the output is a single list rather than separate documents.

testing.yaml

apiVersion: v1
data:
  databag1: try this
  databag2: then try this
kind: ConfigMap
metadata:
  name: data bag info
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: data-bag-service
  name: data-bag-tagging

Code block

import yaml
with open("./testing.yaml", "r") as stream:
    deployment_dict = list(yaml.safe_load_all(stream))

print(deployment_dict)
with open("./testing.yaml", "w") as service_config:
    yaml.dump(
        deployment_dict,
        service_config,
        default_flow_style=False
    )

Output I am getting in testing.yaml:

- apiVersion: v1
  data:
    databag1: try this
    databag2: then try this
  kind: ConfigMap
  metadata:
    name: data bag info
- apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    labels:
      app: data-bag-service
    name: data-bag-tagging

How can I get back the original form, with the documents separated by --- markers?

Ahsan Naseem asked Nov 05 '18 08:11



2 Answers

According to the docs:

If you need to dump several YAML documents to a single stream, use the function yaml.dump_all. yaml.dump_all accepts a list or a generator producing Python objects to be serialized into a YAML document.

yaml.dump_all(
    deployment_dict,
    service_config,
    default_flow_style=False
)

You still need default_flow_style=False to get the block style output.
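
To see the difference between the two calls without touching any files, here is a minimal sketch that dumps an in-memory stand-in for the question's two documents to strings. The shortened docs list and the variable names are made up for illustration:

```python
import yaml

# Two stand-in documents, shortened from the question's testing.yaml
docs = [
    {"apiVersion": "v1", "kind": "ConfigMap"},
    {"apiVersion": "extensions/v1beta1", "kind": "Deployment"},
]

# dump() serializes the list itself: one document holding a sequence
as_list = yaml.dump(docs, default_flow_style=False)

# dump_all() serializes each item of the list as its own document
as_docs = yaml.dump_all(docs, default_flow_style=False)

print(as_list)
print(as_docs)

# Loading the multi-document string again yields the original list
reloaded = list(yaml.safe_load_all(as_docs))
```

The first string is a single document containing a sequence, which is the output the question ran into. The second has the items as separate documents with a --- line between them.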

Example code:

import yaml


with open("./testing.yaml", "r") as stream:
    d = list(yaml.safe_load_all(stream))

d.append(d[-1])

with open("./testing2.yaml", "w") as stream:
    yaml.dump_all(
        d,
        stream,
        default_flow_style=False
    )

testing2.yaml

apiVersion: v1
data:
  databag1: try this
  databag2: then try this
kind: ConfigMap
metadata:
  name: data bag info
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: data-bag-service
  name: data-bag-tagging
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: data-bag-service
  name: data-bag-tagging
Edgar Ramírez Mondragón answered Sep 20 '22 00:09


PyYAML is not really made for this kind of round-trip update: it drops any comments you might have and doesn't necessarily preserve the order of the keys of mappings.
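
As a small illustration of that behaviour, the sketch below loads a two-key mapping with a comment and dumps it again with PyYAML. The input text is made up for this example, and the sort_keys=False variant assumes PyYAML 5.1 or later on Python 3.7+:

```python
import yaml

# Hypothetical input: keys not in alphabetical order, plus a comment
text = """\
kind: ConfigMap  # which kind of object this is
apiVersion: v1
"""

data = yaml.safe_load(text)

# Default dump: the comment is gone and the keys come out sorted
default_dump = yaml.dump(data, default_flow_style=False)

# sort_keys=False keeps the order the keys were loaded in,
# but the comment is still lost
ordered_dump = yaml.dump(data, default_flow_style=False, sort_keys=False)

print(default_dump)
print(ordered_dump)
```

So key order can be kept with a reasonably recent PyYAML, but comments cannot.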

I recommend you take a look at ruamel.yaml (disclaimer: I am the author of that package) for several reasons, including, but not limited to:

  • support of YAML 1.2 (but can write/read YAML 1.1 if necessary)
  • preservation of comments, key order, anchor/alias names, float/integer formats
  • finer control over indentation of mappings and lists
  • no need to load all the documents, process them and dump them in one go
  • optional preservation of quotes and/or block style scalars
  • safe loading by default, and a warning if you use the unsafe load in the backwards compatible API
  • many bug fixes


from pathlib import Path
from ruamel.yaml import YAML

path = Path('testing.yaml')
tmp_path = path.with_suffix('.yaml.tmp')


with YAML(output=tmp_path) as yaml:
    # yaml.indent(mapping=4, sequence=4, offset=2)
    # yaml.preserve_quotes = True
    for data in yaml.load_all(path):
        # update data
        yaml.dump(data)

path.unlink()
tmp_path.rename(path)

print(path.read_text(), end='')

which gives:

apiVersion: v1
data:
  databag1: try this
  databag2: then try this
kind: ConfigMap
metadata:
  name: data bag info
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: data-bag-service
  name: data-bag-tagging

Please note that you cannot read from and write to the same file, because the documents are processed one at a time. Hence the temporary file, which has the additional advantage that if updating the last document raises an error and your program crashes, you are not left with a half-written YAML stream.

Anthon answered Sep 17 '22 00:09