
How can you keep yaml-cpp parser from stripping out all comments?

Tags: c++, comments, yaml

I have a project that needs to read a well-documented YAML file, modify a couple of values, and write it back out. The trouble is that yaml-cpp strips out ("eats") all comments while parsing. Interestingly, the YAML::Emitter class does allow adding comments to the output. Is there a way, which I'm not seeing in the library, to preserve the comments from the input and write them back out? As it stands, I can't see any way to do this with the YAML::Parser class, because it uses the YAML::Scanner class, which is where the comments are actually "eaten".

Doug Barbieri asked Nov 09 '22 08:11



1 Answer

According to the YAML spec:

"Comments are a presentation detail and must not have any effect on the serialization tree or representation graph."

So you need to make the parser non-compliant to preserve comments, and if yaml-cpp did that, they should clearly state so in the documentation.

I did this for Python in ruamel.yaml. If embedding and calling Python from your C++ program is acceptable, you could do something like the following (I used Python 3.5 for this under Linux Mint):

pythonyaml.cpp:

#include <Python.h>

int
update_yaml(const char*yif, const char *yof, const char* obj_path, int val)
{
    PyObject *pName, *pModule, *pFunc;
    PyObject *pArgs, *pValue;
    const char *modname = "update_yaml";
    const char *lus = "load_update_save";

    Py_Initialize();
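    // add current directory to search path
    PyObject *sys_path = PySys_GetObject("path");  // borrowed reference
    PyObject *cwd = PyUnicode_FromString(".");
    PyList_Append(sys_path, cwd);
    Py_DECREF(cwd);  // PyList_Append does not steal the reference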

    pName = PyUnicode_DecodeFSDefault(modname);
    /* Error checking of pName left out */

    pModule = PyImport_Import(pName);
    Py_DECREF(pName);

    if (pModule != NULL) {
        pFunc = PyObject_GetAttrString(pModule, lus);
        /* pFunc is a new reference */

        if (pFunc && PyCallable_Check(pFunc)) {
            pArgs = PyTuple_New(4);
            PyTuple_SetItem(pArgs, 0, PyUnicode_FromString(yif));
            PyTuple_SetItem(pArgs, 1, PyUnicode_FromString(yof));
            PyTuple_SetItem(pArgs, 2, PyUnicode_FromString(obj_path));
            PyTuple_SetItem(pArgs, 3, PyLong_FromLong(val));

            pValue = PyObject_CallObject(pFunc, pArgs);
            Py_DECREF(pArgs);
            if (pValue != NULL) {
                printf("Old value: %ld\n", PyLong_AsLong(pValue));
                Py_DECREF(pValue);
            }
            else {
                Py_DECREF(pFunc);
                Py_DECREF(pModule);
                PyErr_Print();
                fprintf(stderr,"Call failed\n");
                Py_Finalize();  // shut the interpreter down on this error path too
                return 1;
            }
        }
        else {
            if (PyErr_Occurred())
                PyErr_Print();
            fprintf(stderr, "Cannot find function \"%s\"\n", lus);
        }
        Py_XDECREF(pFunc);
        Py_DECREF(pModule);
    }
    else {
        PyErr_Print();
        fprintf(stderr, "Failed to load \"%s\"\n", modname);
        Py_Finalize();  // shut the interpreter down on this error path too
        return 1;
    }
    Py_Finalize();
    return 0;
}


int
main(int argc, char *argv[])
{
    const char *yaml_in_file = "input.yaml";
    const char *yaml_out_file = "output.yaml";
    // propagate the result so the shell sees a non-zero exit code on failure
    return update_yaml(yaml_in_file, yaml_out_file, "abc.1.klm", 42);
}

Create a Makefile (adapt the path to your Python 3.5 installation; the Python headers must be installed, which is normally the case if Python was compiled from source, otherwise you need to install the python3-dev package):

echo -e "SRC:=pythonyaml.cpp\n\ncompile:\n\tgcc \$(SRC) $(/opt/python/3.5/bin/python3-config --cflags --ldflags | tr --delete '\n' | sed 's/-Wstrict-prototypes//') -o pythonyaml"  > Makefile

Compile the program with make.

Create update_yaml.py which will be loaded by pythonyaml:

# coding: utf-8

import traceback
import ruamel.yaml


def set_value(data, key_list, value):
    """key_list is a list of keys used to access nested dicts and lists.
    Dict keys are assumed to be strings; keys for a list must be convertible to integer.
    The list is consumed while descending, and the old value is returned.
    """
    key = key_list.pop(0)
    if isinstance(data, list):
        key = int(key)
    item = data[key]
    if len(key_list) == 0:
        data[key] = value
        return item
    return set_value(item, key_list, value)


def load_update_save(yaml_in, yaml_out, obj_path, value):
    try:
        if not isinstance(obj_path, list):
            obj_path = obj_path.split('.')
        with open(yaml_in) as fp:
            data = ruamel.yaml.round_trip_load(fp)
        # obj_path is already a list here (split above if it was a dotted string)
        res = set_value(data, obj_path, value)
        with open(yaml_out, 'w') as fp:
            ruamel.yaml.round_trip_dump(data, fp)
        return res
    except Exception as e:
        print('Exception', e)
        traceback.print_exc()  # to get some useful feedback if your python has errors
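
To see what set_value() does with a dotted path without involving any YAML library, here is a minimal sketch that runs it on a plain dict and list. The literal `data` is an assumption that mirrors the structure of input.yaml; a real round-trip load returns dict-like and list-like objects that additionally carry the comments. The function body is repeated only so the snippet runs on its own.

```python
def set_value(data, key_list, value):
    """Walk nested dicts/lists along key_list, set the final entry,
    and return the value that was there before."""
    key = key_list.pop(0)
    if isinstance(data, list):
        key = int(key)  # list indices arrive as strings from the dotted path
    item = data[key]
    if len(key_list) == 0:
        data[key] = value
        return item
    return set_value(item, key_list, value)


# Stand-in for the loaded document (assumed shape, comments not represented)
data = {"abc": ["zero-th item of list", {"klm": -999, "xyz": "last entry"}]}

old = set_value(data, "abc.1.klm".split("."), 42)
print("Old value:", old)
print("New value:", data["abc"][1]["klm"])
```

The "1" in the path is converted to an integer only because the container at that level is a list; for a dict the key stays a string.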

Create input.yaml:

abc:
  - zero-th item of list
  - klm: -999        # the answer?
    xyz: last entry  # another comment

If you have ruamel.yaml installed in your Python 3.5 and run ./pythonyaml, it will print Old value: -999, and the new file output.yaml will contain:

abc:
- zero-th item of list
- klm: 42            # the answer?
  xyz: last entry    # another comment

  • Although 42 has only two characters where -999 has four, the comment still aligns with the one below it.
  • Instead of providing a dotted path abc.1.klm you can create a Python list in C++ and hand that to load_update_save() as the third parameter. In that case keys can be something other than strings, or strings that contain a dot.
  • Depending on your usage you might want to change the hard-coded assumption that the value is an integer (PyLong_FromLong for the fourth parameter). The Python program doesn't need updating for that.
  • You can use the same file name for input and output to overwrite the input.
  • It is possible to change the comments from the Python file using ruamel.yaml.
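
The point about passing a list instead of a dotted string can be sketched on the Python side as follows. The data and key names (`servers`, `db.example.com`, `8080`) are hypothetical and chosen only to show a key containing a dot and a non-string key; set_value() is repeated so the snippet is self-contained.

```python
def set_value(data, key_list, value):
    """Same traversal as set_value() in update_yaml.py."""
    key = key_list.pop(0)
    if isinstance(data, list):
        key = int(key)
    item = data[key]
    if len(key_list) == 0:
        data[key] = value
        return item
    return set_value(item, key_list, value)


# Hypothetical document: one key contains a dot, another is an integer
data = {"servers": {"db.example.com": {8080: "open"}}}

path = ["servers", "db.example.com", 8080]
# set_value() pops from the list it is given, so pass a copy to keep `path` intact
old = set_value(data, list(path), "closed")

print(old)
print(data["servers"]["db.example.com"][8080])
```

A dotted string could not express this path, since splitting on '.' would break the host name apart and turn 8080 into the string "8080". The example also sets a string rather than an integer, which works because set_value() does not inspect the value's type.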
Anthon answered Nov 14 '22 23:11