Is there a way to substitute string in YAML. For example, I would like to define sub
once and use it throughout the YAML file.
sub: ['a', 'b', 'c']
command:
params:
cmd1:
type: string
enum : # Get the list defined in 'sub'
description: Exclude commands from the test list.
cmd2:
type: string
enum: # Get the list defined in 'sub'
You can do it through yaml. add_implicit_resolver and yaml. add_constructor . In the code below, you are defining a resolver that will match on ${ env variable } in the YAML value and calling the function path_constructor to look up the environment variable.
Concatenating strings and/or variablesThe standard YAML format does not support the notion of string concatenation. Since concatenating string values is a common pattern in easyconfig files, the EasyBuild framework defines the ! join operator to support this.
Quotes are required when the string contains special or reserved characters. The double-quoted style provides a way to express arbitrary strings, by using \ to escape characters and sequences. For instance, it is very useful when you need to embed a \n or a Unicode character in a string.
You cannot really substitute string values in YAML, as in replacing a substring of some string with another substring¹. YAML does however have a possibility to mark a node (in your case the list ['a', 'b', 'c'] with an anchor and reuse that as an alias node.
Anchors take the form &some_id
and are inserted before a node and alias nodes are specified by *some_id
(instead of a node).
This is not the same as substitution on the string level, because during parsing of the YAML file the reference can be preserved. As is the case when loading YAML in Python for any anchors on collection types (i.e. not when using anchors on scalars):
import sys
import ruamel.yaml as yaml
yaml_str = """\
sub: &sub0 [a, b, c]
command:
params:
cmd1:
type: string
# Get the list defined in 'sub'
enum : *sub0
description: Exclude commands from the test list.
cmd2:
type: string
# Get the list defined in 'sub'
enum: *sub0
"""
data1 = yaml.load(yaml_str, Loader=yaml.RoundTripLoader)
# the loaded elements point to the same list
assert data1['sub'] is data1['command']['params']['cmd1']['enum']
# change in cmd2
data1['command']['params']['cmd2']['enum'][3] = 'X'
yaml.dump(data1, sys.stdout, Dumper=yaml.RoundTripDumper, indent=4)
This will output:
sub: &sub0 [a, X, c]
command:
params:
cmd1:
type: string
# Get the list defined in 'sub'
enum: *sub0
description: Exclude commands from the test list.
cmd2:
type: string
# Get the list defined in 'sub'
enum: *sub0
please note that the original anchor name is preserved² in ruamel.yaml.
If you don't want the anchors and aliases in your output you can override the ignore_aliases
method in the RoundTripRepresenter
subclass of RoundTripDumper
(that method takes two arguments, but using lambda *args: ....
you don't have to know about that):
dumper = yaml.RoundTripDumper
dumper.ignore_aliases = lambda *args : True
yaml.dump(data1, sys.stdout, Dumper=dumper, indent=4)
Which gives:
sub: [a, X, c]
command:
params:
cmd1:
type: string
# Get the list defined in 'sub'
enum: [a, X, c]
description: Exclude commands from the test list.
cmd2:
type: string
# Get the list defined in 'sub'
enum: [a, X, c]
And this trick can be used to read in the YAML file as if you had done string substitution, by re-reading the material you dumped while ignoring the aliases:
data2 = yaml.load(yaml.dump(yaml.load(yaml_str, Loader=yaml.RoundTripLoader),
Dumper=dumper, indent=4), Loader=yaml.RoundTripLoader)
# these are lists with the same value
assert data2['sub'] == data2['command']['params']['cmd1']['enum']
# but the loaded elements do not point to the same list
assert data2['sub'] is not data2['command']['params']['cmd1']['enum']
data2['command']['params']['cmd2']['enum'][5] = 'X'
yaml.dump(data2, sys.stdout, Dumper=yaml.RoundTripDumper, indent=4)
And now only one 'b'
is changed into 'X'
:
sub: [a, b, c]
command:
params:
cmd1:
type: string
# Get the list defined in 'sub'
enum: [a, b, c]
description: Exclude commands from the test list.
cmd2:
type: string
# Get the list defined in 'sub'
enum: [a, X, c]
As indicated above this is only necessary when using anchors/aliases on collection types, not when you use it on a scalar.
¹ Since YAML can create objects, it is however possible to influence
the parser if those objects are created. This answer describes how to do that.
² Preserving the names was originally not available, but was implemented in update of ruamel.yaml
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With