YAML: Use mapped list vs array

Question

I am creating a configuration file for my application, and I decided to use YAML for its simplicity and reliability.

I am currently designing one part of the application in which I have to list and configure all the datasets I want to use in a module. To do that, I wrote this:

    # Other stuff
    datasets:
        rate_variation:
            name: Rate variation over time # Optional
            description: Description here # Optional
            type: POINTS_2D
            options:
                REFRESH_TIME: 5 # Refresh time in seconds
        frequency_variation:
            name: Frequency variation over time
            description: Description here # Optional
            type: POINTS_2D

But after some reflection I have doubts about it, because something like this may be better:

    datasets:
        -   id: rate_variation
            name: Rate variation over time # Optional
            description: Description here # Optional
            type: POINTS_2D
            options:
                REFRESH_TIME: 5 # Refresh time in seconds
        -   id: frequency_variation
            name: Frequency variation over time
            description: Description here # Optional
            type: POINTS_2D

I use the ID to identify each dataset in my scripts (no two datasets may share an id) and to generate output files for each of them. But now I really don't know which solution is best.

What would you recommend, and why?

dreftymac · Accepted Answer

Quick Answer (TL;DR)

YAML can be normalized cleanly and in a straightforward manner using the YAML ddconfig format.
This approach can simplify the construction and maintenance of configuration files and keep them flexible for later use by many types of consuming applications.

Detailed Answer

Context

Data normalization (aka YAML schema definition) with YAML ddconfig format
- (tag:[email protected],2017:ddconfig)
- dmid://uu773yamldata1620421509

Problem

Scenario: Developer graille_stentiplub is creating a configuration file format for use with YAML.
- the data structure (i.e., schema) for the YAML must be flexible for use in many contexts.
- the schema should be amenable to arbitrary and flexible queries where the structure of the YAML does not "get in the way".
- the schema should be easy to read and understand by humans.
- the schema should be easily manipulated by any programming environment capable of processing standard YAML.
Special considerations: graille_stentiplub wants an easy way to determine when to use lists vs mappings.

Example

The following is a simple config file using the YAML ddconfig format:

  dataroot:

      file_metadata_str: |
        ### <beg-block>
        ### - caption: "my first project"
        ###   notes:  |
        ###     * href="//home/sm/docs/workup/my_first_project.txt"
        ### <end-block>

      project_info:
        prj_name_nice:        StackOverflow Demo Answer Project
        prj_name_mach:        stackoverflow_demo_001a
        prj_sponsor_url:      https://stackoverflow.com/questions/54349286
        prj_dept_url:         https://demo-university.edu/dept/basketweaving

      dataset_recipient_list:
        - [email protected]
        - [email protected]
        - [email protected]

      dataset_variations_table:
          -   dvar_id:            rate_variation
              dvar_name:          Rate variation over time      # Optional
              dvar_description:   Description here              # Optional
              dvar_type:          POINTS_2D
              dvar_opt_refresh_per_second: 5                    # Time in seconds

          -   dvar_id:            frequency_variation
              dvar_name:          Frequency variation over time
              dvar_description:   Description here              # Optional
              dvar_type:          POINTS_2D

Explanation

The entire data structure is nested under a top-level key called dataroot (this is optional).
- Including the dataroot key makes the YAML structure more addressable, but it is not required.
- Using a filesystem analogy, you can think of dataroot as a root-level directory.
- Using an XML analogy, you can think of this as the root-level XML tag.
The entire data structure consists of a YAML mapping (aka dictionary, aka associative array).
- Every mapping key is a first-level child of dataroot (or else a top-level key if dataroot is omitted).
There are different types of mapping keys:
- String: (suffix _str) indicates that the mapped value is a string (aka scalar) value.
- List: (suffix _list) indicates that the mapped value is a list (aka sequence).
- Info: (suffix _info) indicates that the mapped value is a mapping (aka dictionary, aka associative array).
- Table: (suffix _table) indicates that the mapped value is a sequence of mappings (aka table).
- Tree: (suffix _tree or _struct) indicates a composite structure with support for one or more nested parent-child relationships.
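Because the suffix announces the shape of the value, a consuming script can check that shape before using it. The following is only a minimal sketch: the helper name find_shape_mismatches is hypothetical (I am not aware of a validator shipped with ddconfig), and the dict stands in for what a YAML loader would return for a trimmed version of the example above.

```python
# Hypothetical shape checks, one per suffix described above.
# _tree and _struct are left unchecked because their shape is open-ended.
SUFFIX_CHECKS = {
    "_str": lambda v: isinstance(v, str),
    "_list": lambda v: isinstance(v, list)
    and not any(isinstance(item, (list, dict)) for item in v),
    "_info": lambda v: isinstance(v, dict)
    and not any(isinstance(x, (list, dict)) for x in v.values()),
    "_table": lambda v: isinstance(v, list)
    and all(isinstance(row, dict) for row in v),
}


def find_shape_mismatches(dataroot):
    """Return the keys whose value does not match the shape implied by the suffix."""
    mismatches = []
    for key, value in dataroot.items():
        for suffix, check in SUFFIX_CHECKS.items():
            if key.endswith(suffix):
                if not check(value):
                    mismatches.append(key)
                break
    return mismatches


# Stand-in for the parsed YAML (assumed loader output, trimmed for brevity).
dataroot = {
    "file_metadata_str": "### <beg-block>\n### <end-block>\n",
    "project_info": {"prj_name_mach": "stackoverflow_demo_001a"},
    "dataset_variations_table": [
        {"dvar_id": "rate_variation", "dvar_type": "POINTS_2D"},
        {"dvar_id": "frequency_variation", "dvar_type": "POINTS_2D"},
    ],
}

print(find_shape_mismatches(dataroot))  # []
```

Keys without a recognized suffix are simply skipped, so the check can be adopted gradually.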

Rationale

The YAML ddconfig format coincides nicely with many different contexts and tools.
This allows for simplified decision making when laying out the configuration file format, as well as simplified programming when parsing the file.

Simplicity

A _list mapping consists of a sequence of scalar-value items with no nesting.
An _info mapping consists of scalar keys and scalar values (name-value pairs) with no nesting.
A _table mapping is simply a sequence of _info mappings.
Nesting of arbitrary depth can be accomplished through YAML anchors and aliases, thus supporting the _tree composite data structure.
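Since a _table is just a sequence of flat records, the "mapping keyed by id" layout from the question can be derived from it whenever a script needs lookup by id. A minimal sketch, assuming the rows have already been loaded into Python lists and dicts; index_table_by is a hypothetical helper name, not part of any library.

```python
def index_table_by(table, id_key):
    """Turn a sequence of records into a dict keyed by id_key, rejecting duplicates."""
    indexed = {}
    for row in table:
        row_id = row[id_key]
        if row_id in indexed:
            # The question requires ids to be unique, so fail loudly.
            raise ValueError(f"duplicate {id_key}: {row_id!r}")
        indexed[row_id] = row
    return indexed


# Stand-in for the parsed dataset_variations_table (assumed loader output).
dataset_variations_table = [
    {"dvar_id": "rate_variation", "dvar_type": "POINTS_2D",
     "dvar_opt_refresh_per_second": 5},
    {"dvar_id": "frequency_variation", "dvar_type": "POINTS_2D"},
]

by_id = index_table_by(dataset_variations_table, "dvar_id")
print(by_id["rate_variation"]["dvar_opt_refresh_per_second"])  # 5
```

Going in this direction also gives one place to enforce id uniqueness, which a plain list does not guarantee on its own.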

Similarity to relational databases

You can think of a ddconfig _info mapping as a single record from a standard table in a relational database.
You can think of a ddconfig _table mapping as a standard table in a relational database.
This similarity makes it straightforward to load the YAML data into a database if and where necessary.
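To illustrate the table analogy, here is a rough sketch that loads a _table into an in-memory SQLite database using Python's standard sqlite3 module. The helper load_table is hypothetical, the rows stand in for already-parsed YAML, and the table and column names are assumed to come from a trusted config file, since they are interpolated into the SQL text.

```python
import sqlite3


def load_table(conn, table_name, rows):
    """Create table_name with one untyped column per key and insert every row."""
    # Union of all keys in first-seen order; keys absent from a row become NULL.
    columns = list(dict.fromkeys(key for row in rows for key in row))
    quoted = ", ".join(f'"{column}"' for column in columns)
    conn.execute(f'CREATE TABLE "{table_name}" ({quoted})')
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f'INSERT INTO "{table_name}" VALUES ({placeholders})',
        [[row.get(column) for column in columns] for row in rows],
    )


# Stand-in for the parsed dataset_variations_table (assumed loader output).
rows = [
    {"dvar_id": "rate_variation", "dvar_type": "POINTS_2D",
     "dvar_opt_refresh_per_second": 5},
    {"dvar_id": "frequency_variation", "dvar_type": "POINTS_2D"},
]

conn = sqlite3.connect(":memory:")
load_table(conn, "dataset_variations_table", rows)
result = conn.execute(
    "SELECT dvar_id, dvar_opt_refresh_per_second "
    "FROM dataset_variations_table ORDER BY dvar_id"
).fetchall()
print(result)  # [('frequency_variation', None), ('rate_variation', 5)]
```

Optional fields map to NULL columns, which is why the flat, non-nested shape of each record matters here.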

Anchors and aliases

The YAML ddconfig format works well with YAML anchors and aliases.
One or more _info mappings can be easily converted to a _table mapping by way of aliases.
Multiple _info mappings can be combined together into another _info mapping by way of YAML merge keys.
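The combined effect of aliases and merge keys can be emulated on plain dicts. This is only a sketch under stated assumptions: the YAML in the comment is a hypothetical fragment, apply_merge_key is a hypothetical helper, and the precedence shown follows the YAML 1.1 merge key type, which is not part of core YAML 1.2, so support varies between loaders.

```python
# Hypothetical YAML being emulated:
#
#   defaults_info: &defaults
#     dvar_type: POINTS_2D
#     dvar_opt_refresh_per_second: 5
#   rate_info:
#     <<: *defaults
#     dvar_id: rate_variation
#     dvar_opt_refresh_per_second: 10


def apply_merge_key(explicit, *merged_sources):
    """Emulate `<<: [*a, *b]`: explicit keys win, then earlier sources beat later ones."""
    result = {}
    for source in reversed(merged_sources):
        result.update(source)
    result.update(explicit)
    return result


defaults_info = {"dvar_type": "POINTS_2D", "dvar_opt_refresh_per_second": 5}
rate_info = apply_merge_key(
    {"dvar_id": "rate_variation", "dvar_opt_refresh_per_second": 10},
    defaults_info,
)
frequency_info = apply_merge_key({"dvar_id": "frequency_variation"}, defaults_info)

# Referencing the _info mappings from a sequence yields a _table.
dataset_variations_table = [rate_info, frequency_info]

print(rate_info["dvar_opt_refresh_per_second"])       # 10
print(frequency_info["dvar_opt_refresh_per_second"])  # 5
```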
