
How to prevent re-definition of keys in YAML?

Is there any way to cause yaml.load to raise an exception whenever a given key appears more than once in the same dictionary?

For example, parsing the following YAML would raise an exception, because some_key appears twice:

{
  some_key: 0,
  another_key: 1,
  some_key: 1
}

Actually, the behavior described above corresponds to the simplest policy regarding key redefinitions. A somewhat more elaborate policy could, for example, specify that only redefinitions that change the value assigned to the key result in an exception, or could allow setting the severity of a key redefinition to "warning" rather than "error". An ideal answer to this question would be capable of supporting such variants.

Asked by kjo, Dec 18 '15 15:12


1 Answer

If you want the loader to throw an error, then you should just define your own loader, with a constructor that checks if the key is already in the mapping ¹:

import ruamel.yaml as yaml

try:
    from collections.abc import Hashable  # Python 3
except ImportError:
    from collections import Hashable  # Python 2

from ruamel.yaml.reader import Reader
from ruamel.yaml.scanner import Scanner
from ruamel.yaml.parser_ import Parser
from ruamel.yaml.composer import Composer
from ruamel.yaml.constructor import Constructor, ConstructorError
from ruamel.yaml.resolver import Resolver
from ruamel.yaml.nodes import MappingNode
from ruamel.yaml.compat import PY2


class MyConstructor(Constructor):
    def construct_mapping(self, node, deep=False):
        if not isinstance(node, MappingNode):
            raise ConstructorError(
                None, None,
                "expected a mapping node, but found %s" % node.id,
                node.start_mark)
        mapping = {}
        for key_node, value_node in node.value:
            # keys can be list -> deep
            key = self.construct_object(key_node, deep=True)
            # lists are not hashable, but tuples are
            if not isinstance(key, Hashable):
                if isinstance(key, list):
                    key = tuple(key)
            if PY2:
                try:
                    hash(key)
                except TypeError as exc:
                    raise ConstructorError(
                        "while constructing a mapping", node.start_mark,
                        "found unacceptable key (%s)" %
                        exc, key_node.start_mark)
            else:
                if not isinstance(key, Hashable):
                    raise ConstructorError(
                        "while constructing a mapping", node.start_mark,
                        "found unhashable key", key_node.start_mark)

            value = self.construct_object(value_node, deep=deep)
            # next two lines differ from original:
            # refuse to overwrite a key that was already seen in this mapping
            if key in mapping:
                raise KeyError("duplicate key: %r" % (key,))
            mapping[key] = value
        return mapping


class MyLoader(Reader, Scanner, Parser, Composer, MyConstructor, Resolver):
    def __init__(self, stream):
        Reader.__init__(self, stream)
        Scanner.__init__(self)
        Parser.__init__(self)
        Composer.__init__(self)
        MyConstructor.__init__(self)
        Resolver.__init__(self)



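# block-style mapping: no braces and no trailing commas
# (a trailing comma would become part of the scalar value)
yaml_str = """\
some_key: 0
another_key: 1
some_key: 1
"""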

data = yaml.load(yaml_str, Loader=MyLoader)
print(data)

and that throws a KeyError.
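
The softer policies mentioned in the question (raise only when the value changes, or warn instead of failing) could be layered on the same duplicate check. The following is a minimal sketch using only the standard library. `store_key`, `DuplicateKeyWarning` and the policy names are hypothetical helpers invented for this illustration and are not part of ruamel.yaml or PyYAML.

```python
import warnings


class DuplicateKeyWarning(UserWarning):
    """Hypothetical warning category for tolerated key redefinitions."""


def store_key(mapping, key, value, policy="error"):
    """Insert key/value into mapping, applying a duplicate-key policy.

    The policy names are illustrative only:
      "error"            - any redefinition raises KeyError
      "error_if_changed" - only a redefinition with a different value raises
      "warn"             - a redefinition emits a warning; the last value wins
    """
    if policy not in ("error", "error_if_changed", "warn"):
        raise ValueError("unknown policy: %r" % (policy,))
    if key in mapping:
        message = "duplicate key: %r" % (key,)
        if policy == "error":
            raise KeyError(message)
        elif policy == "error_if_changed":
            if mapping[key] != value:
                raise KeyError(message)
        else:
            warnings.warn(message, DuplicateKeyWarning, stacklevel=2)
    mapping[key] = value


# Usage: the same redefinition handled under two different policies.
settings = {}
store_key(settings, "some_key", 0)
store_key(settings, "another_key", 1)
store_key(settings, "some_key", 0, policy="error_if_changed")  # same value: accepted
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DuplicateKeyWarning)
    store_key(settings, "some_key", 1, policy="warn")  # tolerated: last value wins
```

Inside construct_mapping, the duplicate check and the assignment to the mapping could then be replaced by a single call to such a helper, with the policy taken from wherever the loader is configured.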

Please note that the curly braces you use in your example are unnecessary.

I am not sure if this will work with merge keys.


¹ This was done using ruamel.yaml, of which I am the author. ruamel.yaml is an enhanced version of PyYAML, and the loader code for the latter should be similar.
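
For plain PyYAML, a comparable check might look like the sketch below. It assumes PyYAML is installed and subclasses `yaml.SafeLoader`; `UniqueKeyLoader` is a name made up for this example. Merge keys (`<<`) are simply skipped by the duplicate check, on the assumption that the base class keeps handling them, so treat that part as lightly tested.

```python
import yaml
from yaml.constructor import ConstructorError


class UniqueKeyLoader(yaml.SafeLoader):
    """Hypothetical loader that rejects repeated keys within one mapping."""

    def construct_mapping(self, node, deep=False):
        if isinstance(node, yaml.MappingNode):
            seen = set()
            for key_node, _value_node in node.value:
                # Leave merge keys ("<<") to the base class.
                if key_node.tag == "tag:yaml.org,2002:merge":
                    continue
                key = self.construct_object(key_node, deep=True)
                try:
                    hash(key)
                except TypeError:
                    # Unhashable key: let the base class report it.
                    continue
                if key in seen:
                    raise ConstructorError(
                        "while constructing a mapping", node.start_mark,
                        "found duplicate key %r" % (key,),
                        key_node.start_mark)
                seen.add(key)
        return super().construct_mapping(node, deep=deep)


# Usage: a mapping without repeated keys loads as usual.
data = yaml.load("some_key: 0\nanother_key: 1\n", Loader=UniqueKeyLoader)
print(data)
```

Loading a document in which some_key appears twice in the same mapping raises a ConstructorError that carries the position of the second occurrence.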

Answered by Anthon, Oct 26 '22 17:10