
YAML as a JSON superset and TAB characters

I can't find a reference to this exact error. YAML 1.2 says it is a JSON superset, yet when I use tab characters in a JSON document, the YAML parser treats them as an error.

e.g.

"root": {
        "key": "value"
}

(Online validation here reports that '\t' cannot start any token.)

I know why YAML historically disallows tabs, but how can I interpret this in the context of JSON-superset?

(e.g. Is YAML not an actual superset or does JSON also disallow tabs? Or the spec does allow for tabs in this case but the implementation is not there yet?)

Thanks.

Vlagged asked Sep 22 '14 12:09




2 Answers

Tabs ARE allowed in YAML, but only where indentation does not apply.

According to YAML 1.2 Section 5.5:

YAML recognizes two white space characters: space and tab.

The following examples will use · to denote spaces and → to denote tabs. All examples can be validated using the official YAML Reference Parser.
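If you want to see which whitespace characters your own file actually uses, a small helper can render indentation in the same · and → notation. This is only a sketch; `show_indentation` is a hypothetical name, not part of any YAML library, and it marks leading whitespace only.

```python
def show_indentation(text: str) -> str:
    """Render leading spaces as '·' and leading tabs as '→' on every line."""
    rendered = []
    for line in text.split("\n"):
        body = line.lstrip(" \t")                 # line content without indentation
        indent = line[: len(line) - len(body)]    # the leading whitespace itself
        rendered.append(indent.replace(" ", "·").replace("\t", "→") + body)
    return "\n".join(rendered)


print(show_indentation("root:\n  key: value"))
print(show_indentation("{\n\troot: {\n  \tkey: value\n    }\n}"))
```

Whitespace inside a line (for example the space after a colon) is left untouched, so only the indentation is made visible.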

YAML has a block style and flow style. In block style, indentation determines the structure of a document. The following document uses block style.

root:
··key: value


In flow style, special characters indicate the structure of the document. The following equivalent document uses flow style.

{
→ root: {
→ → key: value
→ }
}


You can even mix indentation in flow style.

{
→ root: {
··→ key: value
····}
}


If you're mixing block and flow style, the entire flow style part must respect the block style indentation.

root:
··{
····key: value
··}


But you can still mix your indentation within the flow style part.

root:
··{
··→ key: value
··}


If you have a single value document, you can surround the value with all manner of whitespace.

→ ··value··→ 


The point is that every JSON document parsed as YAML is in flow style (because of the initial { or [ character), and flow style supports tabs. The only exception is a single-value JSON document, and in that case YAML still allows padding with whitespace.

If a YAML parser throws because of tabs in a JSON document, then it is not a valid parser.

That being said, your example is failing because a block style mapping value must always be indented if it's not on the same line as the mapping key.

root: {
··key: value
}

is not valid, however

root:
··{
····key: value
··}

is valid, and

root: { key: value }

is also valid.

jordanbtucker answered Oct 15 '22 14:10


I know why YAML historically disallows tabs, but how can I interpret this in the context of JSON-superset?

Taking the rest of the specifications into account, we can only conclude that the "superset" comment is inaccurate. The YAML specification is fundamentally inconsistent in the Relation to JSON section:

YAML can therefore be viewed as a natural superset of JSON, offering improved human readability and a more complete information model. This is also the case in practice; every JSON file is also a valid YAML file. This makes it easy to migrate from JSON to YAML if/when the additional features are required.

JSON's RFC4627 requires that mappings keys merely “SHOULD” be unique, while YAML insists they “MUST” be. Technically, YAML therefore complies with the JSON spec, choosing to treat duplicates as an error. In practice, since JSON is silent on the semantics of such duplicates, the only portable JSON files are those with unique keys, which are therefore valid YAML files.

Despite asserting YAML as a "natural superset of JSON" and stating that "every JSON file is also a valid YAML file", the spec immediately notes some differences regarding key uniqueness. Arguably, the spec should note the differences around using tabs for indentation here as well.

Speaking of which, as the validator implied, YAML explicitly prohibits tabs as indentation characters:

To maintain portability, tab characters must not be used in indentation, since different systems treat tabs differently. Note that most modern editors may be configured so that pressing the tab key results in the insertion of an appropriate number of spaces.

This is, of course, stricter than the JSON specification, which simply states:

Whitespace can be inserted between any pair of tokens.
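As a quick check on the JSON side, here is a minimal sketch using Python's standard `json` module (my choice of parser for illustration). The document is the question's example wrapped in the outer braces a complete JSON object needs, with tabs as the only indentation.

```python
import json

# Tabs (\t) are the only indentation characters in this document.
doc = '{\n\t"root": {\n\t\t"key": "value"\n\t}\n}'

parsed = json.loads(doc)
print(parsed)  # {'root': {'key': 'value'}}
```

The parse succeeds because the JSON grammar lists tab, alongside space, line feed, and carriage return, as insignificant whitespace between tokens.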

So, to directly answer your questions...

(e.g. Is YAML not an actual superset or does JSON also disallow tabs? Or the spec does allow for tabs in this case but the implementation is not there yet?)

...YAML is not actually a superset: JSON does not disallow tabs, whereas the YAML specification does indeed explicitly disallow tabs as indentation characters.
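To find the lines that would trip the indentation rule quoted above, something like the following sketch could help. `tab_indented_lines` is a hypothetical helper, and it is deliberately conservative: it flags any line whose leading whitespace contains a tab, without knowing anything about YAML structure.

```python
from typing import List


def tab_indented_lines(text: str) -> List[int]:
    """Return 1-based numbers of lines whose leading whitespace contains a tab."""
    flagged = []
    for number, line in enumerate(text.split("\n"), start=1):
        body = line.lstrip(" \t")
        indent = line[: len(line) - len(body)]
        if "\t" in indent:
            flagged.append(number)
    return flagged


print(tab_indented_lines("root:\n\tkey: value"))  # [2]
print(tab_indented_lines("root:\n  key: value"))  # []
```

Replacing the tabs on the flagged lines with spaces is the simplest way to make such a file acceptable to a strict validator.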

jmar777 answered Oct 15 '22 13:10