
Python: YAML dictionary of functions: how to load without converting to strings

I have a YAML config file, which contains a dictionary, like so:

"COLUMN_NAME": column_function

It maps strings to functions (which exist and are supposed to be called).

However, when I load it using yaml, I see that the loaded dictionary now maps strings to strings:

'COLUMN_NAME': 'column_function' 

Now I cannot use it as intended - 'column_function' doesn't point to column_function.

What would be a good way to load my dict so that it maps to my functions? After searching and reading a bit on this issue, I'm very cautious about using eval or something like that, since the config file is user-edited.

I think this thread is about my issue, but I'm not sure on the best way to approach it.

  • getattr and setattr are out, I think, because they operate on instantiated objects, whereas I have a simple script.
  • globals(), vars() and locals() provide me with dicts of variables, and I suppose I should use globals(), according to this.

Should I look up the string in them for each of the key-value pairs in my config dict? Is this a good way:

for (key, val) in STRING_DICTIONARY.items():
    try: 
        STRING_DICTIONARY[key] = globals()[val]   
    except KeyError:
        print("The config file specifies a function \"" + val 
               + "\" for column \"" + key 
               + "\". No such function is defined, however. ")
asked Feb 03 '17 21:02 by Zubo



1 Answer

To look up a name val and call it in a generic way, I would use the following:

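import sys

def fun_call_by_name(val):
    if '.' in val:
        module_name, fun_name = val.rsplit('.', 1)
        # you should restrict which modules may be loaded here
        # (an explicit check, because assert is stripped under python -O)
        if not module_name.startswith('my.'):
            raise ValueError("module %r is not allowed" % module_name)
    else:
        # no dot: look the name up in the running script itself
        module_name = '__main__'
        fun_name = val
    try:
        __import__(module_name)
    except ImportError as exc:
        # find_python_name() raises a ConstructorError with position marks
        # here; a plain ImportError keeps this function standalone
        raise ImportError("cannot find module %r (%s)" % (module_name, exc))
    module = sys.modules[module_name]
    fun = getattr(module, fun_name)
    # note: this calls the function and returns its result
    return fun()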

This is adapted from find_python_name() in ruamel.yaml's constructor.py, which is used there to create objects from string scalars. If the val that is handed in contains a dot, the function assumes you are looking up a function name in another module.
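
The question asks for a mapping to the functions themselves rather than to their return values, so a variant that only resolves the name may be closer to what is needed. The following is a minimal sketch, not part of ruamel.yaml: the name resolve_function and its allowed_modules and default_module parameters are made up for illustration, and it uses only the standard library's importlib.

```python
import importlib


def resolve_function(val, allowed_modules=("my.",), default_module="__main__"):
    """Return the callable named by val without calling it.

    allowed_modules is a hypothetical allow-list of module-name prefixes;
    adjust it to whatever your project considers safe to import.
    """
    if "." in val:
        module_name, fun_name = val.rsplit(".", 1)
        # refuse to import anything outside the allowed prefixes
        if not module_name.startswith(tuple(allowed_modules)):
            raise ValueError("module %r is not allowed" % module_name)
    else:
        # no dot: look the name up in the running script
        module_name, fun_name = default_module, val
    module = importlib.import_module(module_name)
    fun = getattr(module, fun_name, None)
    if not callable(fun):
        raise ValueError("%r is not a callable in %r" % (fun_name, module_name))
    return fun


# usage: allow the standard math module for the sake of the example
sqrt = resolve_function("math.sqrt", allowed_modules=("math",))
print(sqrt(9.0))  # 3.0
```

Because the check is a prefix match, an entry such as "my." (with the trailing dot) is stricter than "my", which would also match a module named "mymodule".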

But I wouldn't magically interpret values from your top-level dictionary. YAML has a tagging mechanism (for specific tags, find_python_name() comes into play to control the type of instances that are created). If you have any control over what the YAML file looks like, use tags to mark which values should not be loaded as plain strings, as in this file input.yaml:

COLUMN_NAME: !fun column_function    # tagged
PI_VAL: !fun my.test.pi              # also tagged
ANSWER: forty-two                    # this one has no tag

Assuming a subdirectory my with a file test.py with contents:

import math
def pi():
    return math.pi

You can use:

import sys
import ruamel.yaml

def column_function():
    return 3

def fun_constructor(loader, node):
    val = loader.construct_scalar(node)
    return fun_call_by_name(val)

# add the constructor for the tag !fun
ruamel.yaml.add_constructor('!fun', fun_constructor, Loader=ruamel.yaml.RoundTripLoader)

with open('input.yaml') as fp:
    data = ruamel.yaml.round_trip_load(fp)
assert data['COLUMN_NAME'] == 3
ruamel.yaml.round_trip_dump(data, sys.stdout)

to get:

COLUMN_NAME: 3                       # tagged
PI_VAL: 3.141592653589793            # also tagged
ANSWER: forty-two                    # this one has no tag

If you don't care about dumping data as YAML with comments preserved, you can use SafeLoader and safe_load() instead of RoundTripLoader and round_trip_load(), respectively.
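
If the YAML file cannot be changed to use tags, a tag-free option for the plain string-to-function mapping from the question is an explicit registry, so that only functions you have opted in can be named in the user-edited config. This is a sketch under assumptions: COLUMN_FUNCTIONS, column_fun and resolve_config are hypothetical names, and the loaded dict stands in for whatever a plain YAML load returns.

```python
# Hypothetical registry: only functions registered here may be named in the config.
COLUMN_FUNCTIONS = {}


def column_fun(func):
    """Decorator that registers func under its own name."""
    COLUMN_FUNCTIONS[func.__name__] = func
    return func


@column_fun
def column_function():
    return 3


def resolve_config(string_dict, registry=COLUMN_FUNCTIONS):
    """Return a new dict mapping each key to the registered function."""
    resolved = {}
    for key, name in string_dict.items():
        try:
            resolved[key] = registry[name]
        except KeyError:
            raise ValueError(
                "The config file specifies a function %r for column %r, "
                "but no such function is registered." % (name, key)
            ) from None
    return resolved


loaded = {"COLUMN_NAME": "column_function"}  # what a plain YAML load yields
functions = resolve_config(loaded)
print(functions["COLUMN_NAME"]())  # 3
```

Unlike a globals() lookup, the registry cannot hand out imported modules, builtins, or helper functions that merely happen to be in scope.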

answered Oct 16 '22 05:10 by Anthon