
Lenient JSON Parser for Python

Tags: python, json

Is there a "lenient" JSON Parser for Python?

I keep getting (handwritten) JSON files such as this:

/* This JSON file is created by someone who does not know JSON
   And not competent enough to search about "JSON Validators" */

{

  /* Hey look!
     A honkin' block comment here!
     Yeehaw */

  "key1": "value1",  // Hey look there's a standard-breaking comment here!
  "key3": .65,       // I'm too lazy to type "0"
  "key4": -.75,      // That "other" .Net program works anyways...
  "key5": [ 1 /* One */, 2 /* Two */, 3 /* Three */, 4 /* Four */],
  "key2": "value2",  // Whoopsie, forgot to delete the comma here...
}

The program that actually consumed those monstrously malformed JSON files somehow doesn't puke on those errors. That program is written using C#, by the way.

I'm writing some scripts in Python that will perform things based on those JSON files, but it keeps crashing (correctly) on those mistakes.

I can manually edit those .json files to be standard-compliant... but there are a LOT of them and thus it's too effort-intensive -- not to mention that I will have to keep editing new incoming JSON files, urgh.

So, back to my question, is there a lenient JSON parser that can consume those malformed JSON files without dying?

Note: This other question concerns only the trailing comma after the last item of an object; it does NOT handle block comments and/or inline comments.


Edit: What the... I just received a JSON file in which the creator decided to remove the leading zero from numbers between 0 and 1 ... -_-

And I discovered a file where the comment is embedded... :fuming_red:

I'll update the example above to reflect my additional "findings"...

pepoluan asked Jun 21 '19 06:06


1 Answer

Okay, so @warl0ck's comment made me think that I might be better off writing my own "JSON Preprocessor" to do the heavy-duty cleanup.

So, here it is in my BitBucket Snippet, complete with a simple unit test.

I've tested it with my corpus of human-generated malformed JSON files, and it seems to work well so far...

Let me know if there's a bug in the code.

But for the time being, I'm content.
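For anyone who can't get to the snippet, here is a minimal sketch of what such a preprocessor might look like. To be clear, this is not the code from the snippet; it's an illustration using only the standard `re` and `json` modules, and the names `clean_json` and `loads_lenient` are made up for this example. The idea is to match string literals first so that anything that looks like a comment, comma or dot inside a string is left alone, then strip comments, and only then drop trailing commas and patch numbers like `.65` and `-.75`.

```python
import json
import re

# A JSON string literal. It is tried first in both passes so that "//",
# "/*", "," and "." inside strings are never touched.
_STRING = r'"(?:\\.|[^"\\])*"'

# Pass 1: strings (kept), /* block comments */ and // line comments.
_COMMENTS = re.compile(_STRING + r'|/\*.*?\*/|//[^\n]*', re.DOTALL)

# Pass 2: strings (kept), a comma directly before } or ], and a dot that
# starts a number (not preceded by a digit, followed by a digit).
_FIXUPS = re.compile(_STRING + r'|,(?=\s*[}\]])|(?<![0-9])\.(?=[0-9])', re.DOTALL)


def clean_json(text):
    """Return text with comments, trailing commas and bare-dot numbers fixed."""

    def strip_comment(match):
        token = match.group(0)
        # Keep strings; replace a comment with one space so tokens stay apart.
        return token if token.startswith('"') else ' '

    def fix(match):
        token = match.group(0)
        if token.startswith('"'):
            return token
        # Either a trailing comma (dropped) or a leading dot (gets its zero).
        return '' if token == ',' else '0.'

    without_comments = _COMMENTS.sub(strip_comment, text)
    return _FIXUPS.sub(fix, without_comments)


def loads_lenient(text):
    """Clean the text, then hand it to the strict standard parser."""
    return json.loads(clean_json(text))


if __name__ == "__main__":
    messy = '{"a": .5, /* note */ "b": [1, 2,], "c": "http://x", }'
    print(loads_lenient(messy))  # {'a': 0.5, 'b': [1, 2], 'c': 'http://x'}
```

Comments have to go in a separate first pass because a trailing comma is often followed by a `//` comment before the closing brace, as in the example in the question. This sketch only covers the problems shown there; single-quoted strings, unquoted keys and similar creativity would still make `json.loads` fail.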

pepoluan answered Oct 22 '22 15:10