I have the following YAML file named input.yaml:
    cities:
      1: [0,0]
      2: [4,0]
      3: [0,4]
      4: [4,4]
      5: [2,2]
      6: [6,2]
    highways:
      - [1,2]
      - [1,3]
      - [1,5]
      - [2,4]
      - [3,4]
      - [5,4]
    start: 1
    end: 4
I'm loading it using PyYAML and printing the result as follows:
    import yaml

    f = open("input.yaml", "r")
    data = yaml.load(f, Loader=yaml.Loader)  # PyYAML 6+ requires an explicit Loader
    f.close()
    print(data)
The result is the following data structure:
    {'cities': {1: [0, 0],
                2: [4, 0],
                3: [0, 4],
                4: [4, 4],
                5: [2, 2],
                6: [6, 2]},
     'highways': [[1, 2],
                  [1, 3],
                  [1, 5],
                  [2, 4],
                  [3, 4],
                  [5, 4]],
     'start': 1,
     'end': 4}
As you can see, each city and highway is represented as a list. However, I want them to be represented as a tuple. Hence, I manually convert them into tuples using comprehensions:
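    import yaml

    f = open("input.yaml", "r")
    data = yaml.load(f, Loader=yaml.Loader)  # PyYAML 6+ requires an explicit Loader
    f.close()

    # Convert every city coordinate and every highway from a list to a tuple
    data["cities"] = {k: tuple(v) for k, v in data["cities"].items()}
    data["highways"] = [tuple(v) for v in data["highways"]]

    print(data)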
However, this seems like a hack. Is there some way to instruct PyYAML to directly read them as tuples instead of lists?
Support for Python builtin types and mappings of other types onto YAML syntax. Objects of commonly used Python builtin types may be tersely expressed in YamlConfig. Supported types are str, unicode, int, long, float, decimal.Decimal, bool, complex, dict, list and tuple.
Technically YAML is a superset of JSON. This means that, in theory at least, a YAML parser can understand JSON, but not necessarily the other way around.
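To make the superset claim concrete, here is a minimal sketch that feeds the same JSON text to both the standard json module and PyYAML. The JSON_TEXT string is made up for illustration, and since PyYAML implements YAML 1.1, less trivial JSON documents may not behave identically.

```python
import json

import yaml

# Illustrative JSON document (not from the original post).
JSON_TEXT = '{"cities": {"1": [0, 0], "2": [4, 0]}, "start": 1, "end": 4}'

from_json = json.loads(JSON_TEXT)
from_yaml = yaml.safe_load(JSON_TEXT)  # the YAML parser reads the JSON text directly

print(from_json == from_yaml)
```

Note that the keys stay strings here because JSON requires quoted keys, whereas the unquoted keys in input.yaml are loaded as integers.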
I wouldn't call what you've done hacky for what you are trying to do. As far as I understand, the alternative is to use Python-specific tags in your YAML file so that the data is constructed as the right type when the file is loaded. However, this requires modifying your YAML file, which is likely to be tedious if the file is large.
Look at the PyYAML documentation, which illustrates this further. Ultimately, you want to place a !!python/tuple tag in front of each structure that you want to be represented as a tuple. With your sample data, it would look like this:
YAML FILE:
    cities:
      1: !!python/tuple [0,0]
      2: !!python/tuple [4,0]
      3: !!python/tuple [0,4]
      4: !!python/tuple [4,4]
      5: !!python/tuple [2,2]
      6: !!python/tuple [6,2]
    highways:
      - !!python/tuple [1,2]
      - !!python/tuple [1,3]
      - !!python/tuple [1,5]
      - !!python/tuple [2,4]
      - !!python/tuple [3,4]
      - !!python/tuple [5,4]
    start: 1
    end: 4
Sample code:
    import yaml

    with open('y.yaml') as f:
        # !!python/tuple is a Python-specific tag, so a non-safe Loader is needed
        d = yaml.load(f.read(), Loader=yaml.Loader)

    print(d)
Which will output:
{'cities': {1: (0, 0), 2: (4, 0), 3: (0, 4), 4: (4, 4), 5: (2, 2), 6: (6, 2)}, 'start': 1, 'end': 4, 'highways': [(1, 2), (1, 3), (1, 5), (2, 4), (3, 4), (5, 4)]}
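One caveat with the tag approach: the !!python/tuple tag is only understood by PyYAML's non-safe loaders. The following minimal sketch shows the difference on a one-line document. The TAGGED string and the variable names are made up for illustration.

```python
import yaml

# Illustrative one-line document using the Python-specific tuple tag.
TAGGED = "point: !!python/tuple [0, 0]\n"

# safe_load only knows the standard YAML tags, so it refuses this one.
try:
    yaml.safe_load(TAGGED)
    rejected = False
except yaml.constructor.ConstructorError:
    rejected = True

# The full (unsafe) Loader accepts the tag and builds a real tuple.
data = yaml.load(TAGGED, Loader=yaml.Loader)

print(rejected, data)
```

So tagging the file ties you to a loader that should only be used on input you trust.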
Depending on where your YAML input comes from, your "hack" is a good solution, especially if you use yaml.safe_load() instead of the unsafe yaml.load(). If only the "leaf" sequences in your YAML file need to be tuples, you can do ¹:
    import pprint

    import ruamel.yaml
    from ruamel.yaml.constructor import SafeConstructor


    def construct_yaml_tuple(self, node):
        seq = self.construct_sequence(node)
        # only make "leaf sequences" into tuples, you can add dict
        # and other types as necessary
        if seq and isinstance(seq[0], (list, tuple)):
            return seq
        return tuple(seq)

    SafeConstructor.add_constructor(
        u'tag:yaml.org,2002:seq',
        construct_yaml_tuple)

    with open('input.yaml') as fp:
        data = ruamel.yaml.safe_load(fp)
    pprint.pprint(data, width=24)
which prints:
    {'cities': {1: (0, 0),
                2: (4, 0),
                3: (0, 4),
                4: (4, 4),
                5: (2, 2),
                6: (6, 2)},
     'end': 4,
     'highways': [(1, 2),
                  (1, 3),
                  (1, 5),
                  (2, 4),
                  (3, 4),
                  (5, 4)],
     'start': 1}
If you then need to process more material where sequences need to be "normal" lists again, use:
    SafeConstructor.add_constructor(
        u'tag:yaml.org,2002:seq',
        SafeConstructor.construct_yaml_seq)
¹ This was done using ruamel.yaml, a YAML 1.2 parser, of which I am the author. You should be able to do the same with the older PyYAML if you only ever need to support YAML 1.1 and/or cannot upgrade for some reason.
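For reference, a PyYAML version of the same idea might look like the sketch below. It is not taken from the answer above: TupleSafeLoader, construct_leaf_tuple and the inline DOC string are hypothetical names chosen for illustration. Registering the constructor on a subclass of yaml.SafeLoader, rather than on SafeLoader itself, keeps plain yaml.safe_load() returning lists, so no "restore" step should be needed.

```python
import yaml


class TupleSafeLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass that turns leaf sequences into tuples."""


def construct_leaf_tuple(loader, node):
    seq = loader.construct_sequence(node)
    # Keep sequences of sequences as lists; only leaf sequences become tuples.
    if seq and isinstance(seq[0], (list, tuple)):
        return seq
    return tuple(seq)


# Registered on the subclass only, so yaml.SafeLoader is left untouched.
TupleSafeLoader.add_constructor("tag:yaml.org,2002:seq", construct_leaf_tuple)

# A shortened copy of input.yaml, inlined to keep the sketch self-contained.
DOC = """
cities:
  1: [0,0]
  2: [4,0]
highways:
  - [1,2]
  - [1,3]
start: 1
end: 4
"""

data = yaml.load(DOC, Loader=TupleSafeLoader)
plain = yaml.safe_load(DOC)  # unaffected by the custom constructor

print(data)
```

With this loader the city coordinates and the individual highways should come back as tuples, while the outer highways sequence stays a list.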