
How do I handle recursion in a custom PyYAML constructor?

PyYAML can handle cyclic graphs in regular Python objects. For example:

Snippet #1.

import yaml

class Node: pass
a = Node()
b = Node()
a.child = b
b.child = a
# We now have the cycle a->b->a
serialized_object = yaml.dump(a)
# Recent PyYAML versions require an explicit Loader argument
loaded = yaml.load(serialized_object, Loader=yaml.Loader)

This code succeeds, so clearly there's some mechanism to prevent infinite recursion when loading the serialized object. How do I harness that when I write my own YAML constructor function?

For example, say Node is a class with transient fields foo and bar, and a non-transient field child. Only child should make it into the YAML document. I would hope to do this:

Snippet #2.

def representer(dumper, node):
  return dumper.represent_mapping("!node", {"child": node.child})

def constructor(loader, data):
  result = Node()
  mapping = loader.construct_mapping(data)
  result.child = mapping["child"]
  return result

yaml.add_representer(Node, representer)
yaml.add_constructor("!node", constructor)

# Retry object cycle a->b->a from earlier code snippet
serialized_object = yaml.dump(a)
print(serialized_object)
# Recent PyYAML versions require an explicit Loader argument
loaded = yaml.load(serialized_object, Loader=yaml.Loader)

But it fails:

&id001 !node
child: !node
  child: *id001

yaml.constructor.ConstructorError: found unconstructable recursive node:
  in "<string>", line 1, column 1:
    &id001 !node

I see why. My constructor function isn't built for recursion. It needs to return the child object before it finishes constructing the parent object, and that fails when the child and parent are the same object.

But clearly PyYAML has graph traversals that solve this problem, because Snippet #1 works. Maybe there's one pass to construct all the objects and a second pass to populate their fields. My question is, how can my custom constructor tie into those mechanisms?

An answer to that question would be ideal. But if the answer is that I can't do this with custom constructors, and there is a less desirable alternative (e.g. mixing the YAMLObject class into my Node class), then that answer would be appreciated too.

Travis Wilson asked Jan 07 '15 18:01



1 Answer

For complex types that might involve recursion (mapping/dict, sequence/list, objects), the constructor cannot create the object in one go. Instead, yield the newly created object from the constructor() function and fill in its values afterwards¹. PyYAML registers the yielded object right away, so an alias that points back to it can resolve, and runs the rest of the generator later:

def constructor(loader, data):
    result = Node()
    yield result
    mapping = loader.construct_mapping(data)
    result.child = mapping["child"]

That gets rid of the error.
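For reference, here is a minimal end-to-end sketch that combines the representer from the question with the generator-based constructor and checks that the cycle a->b->a survives a round trip. The attribute defaults in Node.__init__ are assumptions made for illustration, and the explicit Loader=yaml.Loader arguments are there because recent PyYAML releases require a Loader in yaml.load.

```python
import yaml


class Node:
    """Illustrative Node; only `child` is meant to reach the YAML document."""

    def __init__(self):
        self.foo = None    # transient (assumed default)
        self.bar = None    # transient (assumed default)
        self.child = None  # persisted


def representer(dumper, node):
    # Emit only the persisted field under the custom tag.
    return dumper.represent_mapping("!node", {"child": node.child})


def constructor(loader, data):
    result = Node()
    yield result  # hand the (still empty) object to the loader first
    mapping = loader.construct_mapping(data)
    result.child = mapping["child"]  # filled in once the loader resumes us


yaml.add_representer(Node, representer)
yaml.add_constructor("!node", constructor, Loader=yaml.Loader)

# Rebuild the cycle a->b->a
a = Node()
b = Node()
a.child = b
b.child = a

serialized = yaml.dump(a)
restored = yaml.load(serialized, Loader=yaml.Loader)

# The loaded graph is cyclic again: two distinct nodes pointing at each other.
print(restored.child.child is restored)
```

Note that the transient fields are simply reset to whatever Node() assigns, since the constructor creates a fresh instance and only sets child.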

¹ I don't think this is documented anywhere. Had I not studied py/constructor.py intensively while upgrading PyYAML to ruamel.yaml, I would not have known how to do this. A typical case of: read the source, Luke.
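To make the two-phase idea concrete without reading the library source, here is a toy model in plain Python. It is not PyYAML's code: graph, constructed, pending, construct and load are all hypothetical names invented for this sketch, and the "document" is just a dict mapping a node id to the id of its child.

```python
import types

# Toy "document": node id -> id of its child (a cycle a->b->a).
graph = {"a": "b", "b": "a"}

constructed = {}  # node id -> object (possibly still being filled in)
pending = []      # generators whose second phase has not run yet


class Node:
    pass


def node_constructor(node_id):
    result = Node()
    yield result                              # phase 1: expose the empty object
    result.child = construct(graph[node_id])  # phase 2: populate its fields


def construct(node_id):
    if node_id in constructed:
        # Already seen: reuse the object, which is what closes the cycle.
        return constructed[node_id]
    data = node_constructor(node_id)
    if isinstance(data, types.GeneratorType):
        generator = data
        data = next(generator)     # run only up to the first yield
        pending.append(generator)  # postpone the rest
    constructed[node_id] = data
    return data


def load(root_id):
    root = construct(root_id)
    while pending:  # finishing one object may queue further generators
        generator = pending.pop()
        for _ in generator:
            pass
    return root


a = load("a")
print(a.child.child is a)
```

A constructor that returned the finished object instead of yielding would have to build the child before the parent was registered, which is the situation the "unconstructable recursive node" error reports.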

Anthon answered Sep 17 '22 00:09