
JSON diff of large JSON data, finding some JSON as a subset of another JSON

I have a problem I'd like to solve without spending a lot of time on manual analysis.

I have two JSON objects, returned in HTTP responses from different web service APIs. The two objects contain overlapping data and have similar, but not identical, structure. The smaller one is roughly a subset of the bigger one.

I want to find all the intersecting data between the two objects. Actually, I'm more interested in the parameters/properties the objects share than in the actual values of those properties, because I eventually want to use data from one JSON output to construct the other JSON as input to an API call. Unfortunately, I don't have the documentation that defines the JSON for each API. :(
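For illustration, comparing property paths rather than values might look like the following Python sketch. It is not taken from any of the tools mentioned here; the function names `key_paths` and `shared_paths` and the sample documents are hypothetical, and it assumes that shared properties sit at the same nesting path in both objects.

```python
import json


def key_paths(node, prefix=""):
    """Return the set of property paths found in a parsed JSON value.

    Array elements are collapsed into a single "[]" segment so that the
    structure is compared, not the number or order of items.
    """
    paths = set()
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else key
            paths.add(path)
            paths |= key_paths(value, path)
    elif isinstance(node, list):
        for item in node:
            paths |= key_paths(item, prefix + "[]")
    return paths


def shared_paths(small_text, big_text):
    """Return (paths in both documents, paths only in the smaller one)."""
    small = key_paths(json.loads(small_text))
    big = key_paths(json.loads(big_text))
    return sorted(small & big), sorted(small - big)


if __name__ == "__main__":
    # Hypothetical stand-ins for the two API responses.
    small_doc = '{"id": 1, "owner": {"name": "a"}, "tags": [{"k": "x"}]}'
    big_doc = ('{"id": 9, "owner": {"name": "b", "email": "c"}, '
               '"tags": [{"k": "y", "v": 2}], "extra": true}')
    common, missing = shared_paths(small_doc, big_doc)
    print("shared:", common)
    print("only in small:", missing)
```

If the two APIs nest the same property at different depths, matching on the last path segment only (the bare property name) may be a more forgiving variant.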

What makes this tougher is that the JSON objects are huge. One spans a page if you print it from Windows Notepad; the other spans 37 pages. The APIs return the JSON compressed onto a single line, so a normal text compare doesn't do much: I'd have to reformat the objects, manually or with a script, to break them up with newlines before a text compare would work well. I tried this with the Beyond Compare tool.

I could do a manual search/grep, but it's a pain to cycle through all the parameters inside the smaller JSON. I could write code to do it, but I'd have to spend time writing it and testing that it works. Or maybe there's some ready-made code for that already...

Or can look for JSON diff type tools. Searched for some. Came across these:

https://github.com/samsonjs/json-diff or https://tlrobinson.net/projects/javascript-fun/jsondiff

https://github.com/andreyvit/json-diff

Both failed to do what I wanted. Presumably the JSON is either too complex or too large for them to process.

Any thoughts on best solution? Or might the best solution for now be manual analysis w/ grep for each parameter/property?

In terms of a code solution, any language will do. I just need a parser or diff tool that will do what I want.

Sorry, can't share the JSON data structure with you either, it may be considered confidential.

David asked Oct 09 '12 01:10



2 Answers

Beyond Compare works well if you set up a JSON file format in it that uses Python to pretty-print the JSON. Sample setup for Windows:

  1. Install Python 2.7.
  2. In Beyond Compare, go under Tools, under File Formats.
  3. Click New. Choose Text Format. Enter "JSON" as a name.
  4. Under the General tab:
    • Mask: *.json
  5. Under the Conversion tab:
    • Conversion: External program (Unicode filenames)
    • Loading: c:\Python27\python.exe -m json.tool %s %t
      • Note that the second parameter in the command line must be %t; if you enter %s twice, you will suffer data loss.
  6. Click Save.
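
To show what the conversion step does, here is a rough Python 3 sketch of the same reformatting that `python -m json.tool` performs: it reads single-line JSON and writes an indented copy. The helper names `pretty_print` and `convert` are made up for this example, and the steps above use Python 2.7, so treat this as an approximation rather than the exact command Beyond Compare runs.

```python
import json
import sys


def pretty_print(text, indent=4):
    """Re-serialize single-line JSON text as indented, multi-line JSON."""
    return json.dumps(json.loads(text), indent=indent) + "\n"


def convert(source_path, target_path):
    """Read JSON from source_path and write the indented form to target_path."""
    with open(source_path, encoding="utf-8") as src:
        formatted = pretty_print(src.read())
    # The target must be a different file from the source; writing back to
    # the source path would overwrite the original data.
    with open(target_path, "w", encoding="utf-8") as dst:
        dst.write(formatted)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): python pretty.py input.json output.json
    convert(sys.argv[1], sys.argv[2])
```

Once both responses are reformatted this way, each property lands on its own line, which is what lets a line-based compare line them up.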
Josh Kelley answered Nov 10 '22 07:11


Jeremy Simmons has created a better file format package for Beyond Compare, posted on the forum as "JsonFileFormat.bcpkg", that does not require Python or anything similar to be installed.

Just download the file and open it with Beyond Compare and you are good to go, so it's much simpler. His forum post reads:

JSON File Format

I needed a file format for JSON files.

I wanted to pretty-print & sort my JSON to make comparison easy.

I have attached my bcpackage with my completed JSON File Format.

The formatting is done via jq - http://stedolan.github.io/jq/

Props to Stephen Dolan for the utility https://github.com/stedolan.

I have sent a message to the folks at Scooter Software asking them to include it in the page with additional formats.

If you're interested in seeing it on there, I'm sure a quick reply to the thread with an up-vote would help them see the value of posting it.

Attached file: JsonFileFormat.bcpkg (449.8 KB)
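
The package itself relies on jq, and its contents are not reproduced here. As a rough illustration of the same pretty-print-and-sort idea, the following Python sketch normalizes two JSON documents (indented, keys sorted) and diffs the result with the standard `difflib` module. The names `normalize` and `json_diff` and the sample inputs are hypothetical.

```python
import difflib
import json


def normalize(text):
    """Parse JSON text and return indented lines with object keys sorted."""
    return json.dumps(json.loads(text), indent=2, sort_keys=True).splitlines()


def json_diff(left_text, right_text):
    """Return a unified diff of two JSON documents after normalization."""
    return list(difflib.unified_diff(
        normalize(left_text), normalize(right_text),
        fromfile="left", tofile="right", lineterm=""))


if __name__ == "__main__":
    # Same keys in a different order, with one changed value.
    left = '{"b": 1, "a": {"y": 2, "x": 1}}'
    right = '{"a": {"x": 1, "y": 3}, "b": 1}'
    print("\n".join(json_diff(left, right)))
```

Sorting the keys first means that two objects differing only in property order produce an empty diff, so only real differences show up.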

Alex S answered Nov 10 '22 08:11