
Round-trip parsing of data structure format (YAML or whatnot) preserving comments, for writing configuration

Tags: json, xml, yaml, perl, ini

I have been using YAML as the configuration file format in several applications, and all has gone well except for one thing: when my program needs to write or modify a config variable in a YAML config file, it destroys formatting and comments, because it loads and dumps the entire file/structure.
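The load-modify-dump effect described above is not specific to YAML. As a minimal, dependency-free sketch, the following reproduces it with Python's standard-library `configparser` on an INI file (a stand-in chosen only so the example runs without a YAML library; the file contents are made up for illustration):

```python
import configparser
import io

original = """\
# Database settings (edit with care)
[database]
# port the server listens on
port = 5432
"""

cfg = configparser.ConfigParser()
cfg.read_string(original)          # full-line comments are skipped while parsing
cfg["database"]["port"] = "5433"   # the one change we actually wanted

buf = io.StringIO()
cfg.write(buf)                     # re-serializes from the parsed structure only
rewritten = buf.getvalue()
print(rewritten)
```

The new value is written, but both comment lines are gone, because the writer only ever sees the parsed data, not the original text.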

(Well, there is actually another problem with YAML. Most users, many of whom are not programmers, will trip over the details of YAML's rules, like the significance of whitespace in some places. But this is not a major gripe.)

What I would prefer is a YAML loader/dumper that can do round-trip parsing (preserving all whitespace and comments), or some other human-readable serialization format that has such a parser. I'm even considering using Perl documents and PPI, since PPI is a round-trip-safe parser. Or perhaps PPI can be bent to deal with YAML or similar formats? I'd rather not use XML; I'd resort to INI+(JSON|YAML|... for key values) before that.

Any advice or pointers?
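To make the round-trip requirement concrete, here is a minimal sketch for a flat `key = value` format with `#` comments. It edits the text line by line instead of loading and dumping a structure. The function name `set_value` and the sample config are hypothetical, and the approach assumes values never contain `#`; it is not a general parser for INI or YAML.

```python
import re

def set_value(text: str, key: str, new_value: str) -> str:
    """Replace the value of `key` in `key = value` lines, leaving every
    other character (comments, blank lines, spacing) untouched."""
    pattern = re.compile(
        r"^(?P<head>\s*" + re.escape(key) + r"\s*=\s*)"  # indentation, key, '='
        r"(?P<value>[^#\n]*?)"                           # the old value
        r"(?P<tail>\s*(?:#.*)?)$"                        # trailing comment, if any
    )
    out = []
    for line in text.splitlines(keepends=True):
        body = line.rstrip("\n")
        newline = line[len(body):]
        m = pattern.match(body)
        if m:
            # Only the value changes; spacing and comment are copied verbatim.
            body = m.group("head") + new_value + m.group("tail")
        out.append(body + newline)
    return "".join(out)

config = """\
# server settings
host = example.org   # public name

port = 8080          # default port
"""
updated = set_value(config, "port", "9090")
print(updated)
```

Because untouched lines are copied verbatim and a matched line only has its value swapped, comments and alignment survive the edit. The price is that the code never builds a real data structure, so nested values would need a proper round-trip parser.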

Steven Haryanto asked Aug 04 '11 11:08




2 Answers

If you are using block-structured YAML and Python is acceptable, you can use the Python package¹ ruamel.yaml, which is a derivative of PyYAML and supports round-trip preservation of comments:

import sys
import ruamel.yaml

inp = """\
# example
name:
  # details
  family: Smith   # very common
  given: Alice    # one of the siblings
"""

yaml = ruamel.yaml.YAML()

code = yaml.load(inp)
code['name']['given'] = 'Bob'

yaml.dump(code, sys.stdout)

with result:

# example
name:
  # details
  family: Smith   # very common
  given: Bob      # one of the siblings

Note that the end-of-line comments are still aligned.

Instead of normal list and dict objects, code consists of wrapped versions² to which the comments are attached.

¹ Install with pip install ruamel.yaml. Works on Python 2.6/2.7/3.3+. Disclaimer: I am the author of that package.
² ordereddict is used in case of a mapping, to preserve ordering

Anthon answered Nov 05 '22 19:11



Yeah, you and everyone else who thought "wow, YAML sounds cool". Simply put, what you are asking for doesn't exist yet.

Update: you probably want to use Config::General; it uses the Apache config format (XML-ish).

No, PPI is not a general-purpose tool. If you want BNF-ness, you want to use Marpa.

Of INI, JSON, YAML, and XML, XML probably has the best editor support for non-programmers (sounds crazy).

wantpretty answered Nov 05 '22 20:11
