
Handling lazy JSON in Python - 'Expecting property name'

Tags:

python

json

Using Python's (2.7) 'json' module, I'm looking to process various JSON feeds. Unfortunately, some of these feeds do not conform to the JSON standard: specifically, some keys are not wrapped in double quotes ("). This causes json.loads to raise a ValueError.

Before writing an ugly-as-hell piece of code to parse and repair the incoming data, I thought I'd ask - is there any way to allow Python to either parse this malformed JSON or 'repair' the data so that it would be valid JSON?

Working example

    >>> import json
    >>> json.loads('{"key1":1,"key2":2,"key3":3}')
    {'key3': 3, 'key2': 2, 'key1': 1}

Broken example

    >>> import json
    >>> json.loads('{key1:1,key2:2,key3:3}')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Python27\lib\json\__init__.py", line 310, in loads
        return _default_decoder.decode(s)
      File "C:\Python27\lib\json\decoder.py", line 346, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "C:\Python27\lib\json\decoder.py", line 362, in raw_decode
        obj, end = self.scan_once(s, idx)
    ValueError: Expecting property name: line 1 column 1 (char 1)

I've written a small regex to fix the JSON coming from this particular provider, but I foresee this being an issue in the future. Below is what I came up with.

    >>> import re
    >>> s = '{key1:1,key2:2,key3:3}'
    >>> s = re.sub('([{,])([^{:\s"]*):', lambda m: '%s"%s":'%(m.group(1),m.group(2)),s)
    >>> s
    '{"key1":1,"key2":2,"key3":3}'
Seidr asked Oct 27 '10 13:10


2 Answers

You're trying to use a JSON parser to parse something that isn't JSON. Your best bet is to get the creator of the feeds to fix them.

I understand that isn't always possible. You might be able to fix the data using regexes, depending on how broken it is:

    j = re.sub(r"{\s*(\w)", r'{"\1', j)
    j = re.sub(r",\s*(\w)", r',"\1', j)
    j = re.sub(r"(\w):", r'\1":', j)
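The same idea can be folded into a single substitution that only touches a bare word sitting between a `{` or `,` and a `:`. This is a rough sketch, not a general repair tool: the names `BARE_KEY` and `quote_bare_keys` are made up for illustration, and it assumes keys look like identifiers and that no string value contains text such as `,word:` (which would be quoted by mistake).

```python
import json
import re

# Hypothetical pattern: an identifier-like bare key that follows '{' or ','
# (with optional whitespace) and is followed by ':'.
BARE_KEY = re.compile(r'([{,]\s*)([A-Za-z_]\w*)(\s*:)')


def quote_bare_keys(text):
    """Wrap unquoted, identifier-like keys in double quotes.

    Keys that are already quoted are left alone, because a '"' right
    after '{' or ',' does not match the identifier part of the pattern.
    """
    return BARE_KEY.sub(r'\1"\2"\3', text)


repaired = quote_bare_keys('{key1:1,key2:2,key3:3}')
print(repaired)

# The repaired text is now acceptable to the standard parser.
data = json.loads(repaired)
print(data["key2"])
```

Because the pattern never looks inside string values, it is only safe on feeds where you know what the values look like; anything more broken than missing key quotes probably needs a lenient parser instead.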
Ned Batchelder answered Oct 11 '22 11:10


Another option is the demjson module, which can parse JSON in non-strict mode.
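One way this might be wired in is to try the standard parser first and fall back to demjson only when that fails. A minimal sketch, with some assumptions: demjson is a third-party package that has to be installed separately, `demjson.decode` is taken to be its parsing entry point with non-strict behaviour by default, and `parse_lenient` is a made-up helper name.

```python
import json

try:
    import demjson  # third-party; assumed to be installed separately
except ImportError:
    demjson = None


def parse_lenient(text):
    """Parse with the strict stdlib parser, falling back to demjson.

    If demjson is not available, the original ValueError is re-raised
    so malformed input is never silently accepted.
    """
    try:
        return json.loads(text)
    except ValueError:
        if demjson is None:
            raise
        # Assumption: decode() is lenient (non-strict) unless told otherwise.
        return demjson.decode(text)


# Well-formed input never reaches the fallback.
print(parse_lenient('{"key1": 1, "key2": 2}'))
```

Keeping the strict parser as the first attempt means well-behaved feeds are unaffected, and the lenient path only runs for the providers that need it.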

Joel answered Oct 11 '22 12:10