I am trying to compare two files in which each line is a JSON document. I need to compare the files line by line and return the differences. The files are too large for me to read into memory and compare in full. Please suggest an optimised way of doing this.
Comparing JSON: once the JSON text has been parsed, comparing it is quite simple: use the '==' operator. Note that '==' and 'is' are not the same. '==' checks equality of values, whereas 'is' checks whether both names refer to the same object. Use '==' here; 'is' will not give the expected result.
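As a small illustration of the '==' versus 'is' point, here is a minimal sketch using the standard json module. The sample strings and variable names are made up for the example.

```python
import json

# Two JSON strings with the same content but different key order
# (sample data invented for this illustration).
a = json.loads('{"id": 1, "tags": ["x", "y"]}')
b = json.loads('{"tags": ["x", "y"], "id": 1}')

# '==' compares values; dict equality does not depend on key order.
values_equal = (a == b)

# 'is' compares object identity; a and b are two separate dict objects.
same_object = (a is b)

print(values_equal, same_object)
```

Here values_equal is True and same_object is False, which is why 'is' is the wrong tool for this job.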
This means that complete JSON trees can be compared for equality by comparing their root nodes (for example, nodes returned by mapper.readTree). One reported caveat is that the comparison does not work when the JSON fields are in a different order, and that it compares only the structure of the JSON documents.
To treat two JSON objects with the same elements in a different order as equal in Python, load the JSON strings into dicts with json.loads, sort the items with sorted, and then compare the results.
First, read the JSON data from the two files, sort the data in ascending order by key, and finally use the equality operator == to compare the sorted data.
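The load, sort, and compare idea might look like the sketch below. Python dicts already compare equal regardless of key order, so the sorting mainly matters for lists. This sketch assumes list order should be ignored, which is a choice and not something every use case wants. The helper names canonical and json_equal_ignoring_order are hypothetical.

```python
import json

def canonical(value):
    """Return a copy of a parsed JSON value with every list sorted.

    Assumption: list order is irrelevant for the comparison.
    """
    if isinstance(value, dict):
        return {key: canonical(item) for key, item in value.items()}
    if isinstance(value, list):
        # Sort by a stable text form so mixed types (numbers, strings,
        # nested objects) can be ordered without raising TypeError.
        return sorted(
            (canonical(item) for item in value),
            key=lambda item: json.dumps(item, sort_keys=True),
        )
    return value

def json_equal_ignoring_order(text_a, text_b):
    """Compare two JSON strings, ignoring key order and list order."""
    return canonical(json.loads(text_a)) == canonical(json.loads(text_b))
```

If list order does matter in your data, drop the list branch and compare the parsed dicts with == directly.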
Let's read the input JSON as JsonNode and compare. Again, note that equals() can also compare two input JSON objects with nested elements.
JSON Compare can find differences between JSON APIs, JSON files, and JSON data. It can also beautify (format) JSON, download the data as a JSON file, and accept JSON that is copied and pasted directly.
JSON Diff compares JSON data and finds the differences. It provides several views that help locate differences in JSON code and JSON files, can beautify the compared data, and lets you download it.
Two possible ways:
Given that you have a large file, you are better off using the difflib technique described in point 1.
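The list that "point 1" refers to is not reproduced here, so the following is only a guess at what a difflib-based comparison could look like. It uses difflib.unified_diff from the standard library; the function name diff_lines and the file labels are hypothetical.

```python
import difflib

def diff_lines(lines_a, lines_b, name_a="a.jsonl", name_b="b.jsonl"):
    """Return a unified diff of two sequences of text lines.

    n=0 drops context lines, so only changed lines are reported.
    lineterm="" assumes the input lines carry no trailing newline.
    """
    return list(
        difflib.unified_diff(
            lines_a, lines_b,
            fromfile=name_a, tofile=name_b,
            lineterm="", n=0,
        )
    )

# Example with in-memory lines (sample data invented for illustration).
result = diff_lines(['{"id": 1}', '{"id": 2}'], ['{"id": 1}', '{"id": 3}'])
for line in result:
    print(line)
```

Note that difflib works on in-memory sequences and compares raw text, so two lines holding the same JSON with a different key order are reported as different. For very large files it may need to be applied to one chunk at a time.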
Edit based on response to my below answer:
After some research, it appears that the best way to deal with large data payloads is to process them as a stream. This keeps processing fast while keeping memory usage under control.
Refer to this link, which discusses streaming JSON data objects using Python. Similarly, take a look at ijson, an iterator-based JSON parsing library for Python.
Hopefully this helps you identify a library that fits your use case.
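For files with one JSON document per line, as in the question, the streaming idea can be sketched with the standard library alone, without ijson. The sketch assumes every line is valid JSON and that line N of one file corresponds to line N of the other. The names iter_line_diffs and MISSING and the file names in the usage comment are hypothetical.

```python
import json
from itertools import zip_longest

# Sentinel for "this file has no such line"; distinct from JSON null (None).
MISSING = object()

def _parse(raw):
    """Parse one line of JSON text, or pass through the MISSING sentinel."""
    return MISSING if raw is MISSING else json.loads(raw)

def iter_line_diffs(lines_a, lines_b):
    """Yield (line_number, value_a, value_b) where the parsed lines differ.

    Both arguments can be any iterables of strings, including open file
    objects, so only one pair of lines is held in memory at a time.
    """
    pairs = zip_longest(lines_a, lines_b, fillvalue=MISSING)
    for number, (raw_a, raw_b) in enumerate(pairs, start=1):
        value_a, value_b = _parse(raw_a), _parse(raw_b)
        if value_a is MISSING or value_b is MISSING or value_a != value_b:
            yield number, value_a, value_b

# Usage with files (file names are placeholders):
# with open("a.jsonl") as fa, open("b.jsonl") as fb:
#     for number, a, b in iter_line_diffs(fa, fb):
#         print(number, a, b)
```

Because the comparison is done on parsed values, lines that differ only in key order or whitespace are not reported. If the two files are not aligned line by line, a key-based join or a sort step would be needed first.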
This seems to be a pretty solid start: https://github.com/ZoomerAnalytics/jsondiff
$ pip install jsondiff
>>> from jsondiff import diff
>>> diff({'a': 1, 'b': 2}, {'b': 3, 'c': 4}, syntax='symmetric')
{insert: {'c': 4}, 'b': [2, 3], delete: {'a': 1}}
I'm also going to try it out for a current project; I'll try to post updates and edits as I go along.