
Most diffable data interchange format?

JSON is great because it has wide support, and it's easy for both machines and humans to read and write.

YAML is great because it's even easier for humans to read and write, and it has support for more data types.

TOML is like an improved version of INI.

I want to optimize for something different: diffability, i.e., how easy it is to understand what changed between two versions of the same document when they are run through a standard diff tool.

As far as I can tell, Yarn went so far as to create their own custom format for their lock files just to improve this aspect.

Are there any open-source JS libraries for producing diffable output from an object?

mpen asked Mar 18 '26 11:03


1 Answer

Canonicalized then prettified JSON

Canonicalization normalizes how each value is serialized and sorts object fields by name.

Prettifying adds back whitespace and line breaks.

A standard for the prettifying step still needs to be defined.

I would like to see a YAML equivalent with the same diffability. In principle, that could be done by converting YAML to JSONC, canonicalizing the JSONC, and converting the result back to YAML. In practice it will not be that simple: the JSONC-to-YAML conversion would also need to be standardized, and a JSONC canonicalizer might not exist yet.

Note: Prettifying makes the output no longer canonical, but it is necessary for diffability, since line-based diff tools need each value on its own line.
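The canonicalize-then-prettify idea can be sketched in plain JavaScript. This is only a rough illustration, not a JCS-conformant implementation: `diffableStringify` is a hypothetical name, the two-space indent is an arbitrary choice, and the input is assumed to be JSON-compatible already (no `undefined`, functions, or cycles).

```javascript
// Hypothetical helper: serialize a JSON-compatible value with sorted keys
// and one member or element per line, so line-based diffs stay small.
function diffableStringify(value, indent = "  ", level = 0) {
  if (value === null || typeof value !== "object") {
    // Primitives: reuse the built-in serialization of numbers, strings, etc.
    return JSON.stringify(value);
  }
  const pad = indent.repeat(level + 1);
  const closePad = indent.repeat(level);
  if (Array.isArray(value)) {
    if (value.length === 0) return "[]";
    // Array order is meaningful, so elements are never reordered.
    const items = value.map((v) => pad + diffableStringify(v, indent, level + 1));
    return "[\n" + items.join(",\n") + "\n" + closePad + "]";
  }
  // The default sort compares strings by UTF-16 code units.
  const keys = Object.keys(value).sort();
  if (keys.length === 0) return "{}";
  const members = keys.map(
    (k) => pad + JSON.stringify(k) + ": " + diffableStringify(value[k], indent, level + 1)
  );
  return "{\n" + members.join(",\n") + "\n" + closePad + "}";
}

// Two versions of the same document, written with different key order.
const before = diffableStringify({ name: "app", deps: { b: "2.0.0", a: "1.0.0" } });
const after = diffableStringify({ deps: { a: "1.0.0", b: "2.1.0" }, name: "app" });
console.log(before);
console.log(after);
```

Because the output string is built by hand rather than by rebuilding objects, integer-like keys such as "10" and "9" are also emitted in sorted string order, which JavaScript's own property ordering would not preserve.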

RFC 8785 (JSON Canonicalization Scheme, JCS) offers a sample ES6 JSON canonicalizer.

The following Open Source implementations have been verified to be compatible with JCS:

  • JavaScript: https://www.npmjs.com/package/canonicalize
  • Java: https://github.com/erdtman/java-json-canonicalization
  • Go: https://github.com/cyberphone/json-canonicalization/tree/master/go
  • .NET/C#: https://github.com/cyberphone/json-canonicalization/tree/master/dotnet
  • Python: https://github.com/cyberphone/json-canonicalization/tree/master/python3

— Open Source Implementations

Canonicalize

Raw

  {
    "numbers": [333333333.33333329, 1E30, 4.50,
                2e-3, 0.000000000000000000000000001],
    "string": "\u20ac$\u000F\u000aA'\u0042\u0022\u005c\\\"\/",
    "literals": [null, true, false]
  }

Remove whitespace and normalize serialization

{"numbers":[333333333.3333333,1e+30,4.5,0.002,1e-27],"string":"€$\u000f\nA'B\"\\\\\"/","literals":[null,true,false]}

Sort

{"literals":[null,true,false],"numbers":[333333333.3333333,1e+30,4.5,0.002,1e-27],"string":"€$\u000f\nA'B\"\\\\\"/"}

Prettify

{
  "literals": [
    null,
    true,
    false
  ],
  "numbers": [
    333333333.3333333,
    1e+30,
    4.5,
    0.002,
    1e-27
  ],
  "string": "€$\u000f\nA'B\"\\\\\"/"
}
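The walkthrough above can be approximated with nothing but the built-in `JSON` object. This is a sketch under assumptions, not the RFC's sample code: `sortKeys` is a hypothetical replacer, and the shortcut of rebuilding objects is only safe when no keys look like integers, because JavaScript always iterates such keys first in numeric order.

```javascript
// The raw sample, wrapped in String.raw so the backslashes reach
// JSON.parse untouched.
const raw = String.raw`{
  "numbers": [333333333.33333329, 1E30, 4.50,
              2e-3, 0.000000000000000000000000001],
  "string": "\u20ac$\u000F\u000aA'\u0042\u0022\u005c\\\"\/",
  "literals": [null, true, false]
}`;

// Hypothetical replacer: rebuild each plain object with its keys sorted.
// Arrays and primitives are passed through unchanged.
const sortKeys = (_key, value) =>
  value !== null && typeof value === "object" && !Array.isArray(value)
    ? Object.fromEntries(Object.keys(value).sort().map((k) => [k, value[k]]))
    : value;

const parsed = JSON.parse(raw);

// Step 1: remove whitespace and normalize number and string serialization.
const normalized = JSON.stringify(parsed);
// Step 2: additionally sort the object keys.
const sorted = JSON.stringify(parsed, sortKeys);
// Step 3: prettify with a two-space indent (an arbitrary choice).
const pretty = JSON.stringify(parsed, sortKeys, 2);

console.log(normalized);
console.log(sorted);
console.log(pretty);
```

For input with integer-like keys, or when strict JCS conformance matters, one of the verified implementations listed earlier is the safer choice for the first two steps.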
Gabriel answered Mar 20 '26 15:03


