How to read JSON file with multiple JSON objects in Python?

Question

Here is the standard way to read a JSON file in Python:

import json
from pprint import pprint

with open('ig001.json') as data_file:
    data = json.load(data_file)

pprint(data)

However, the JSON file I want to read has multiple JSON objects in it, so it looks something like:

[{},{}.... ]

This represents 2 JSON objects, and inside each {} there are a bunch of key:value pairs.

When I try to read this with the code above, I get this error:

Traceback (most recent call last):
  File "jsonformatter.py", line 5, in <module>
    data = json.load(data_file)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 290, in load
    **kw)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 369, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 3889 column 2 - line 719307 column 2 (char 164691 - 30776399)

Line 3889 is where the first JSON object ends and the next one begins; the line itself looks like "][".
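
The same failure can be reproduced on a tiny string of the same shape. This is only a minimal sketch, assuming the file really is two top-level arrays sitting back to back; the sample string and the helper name try_parse are made up for illustration. On Python 3 the exception raised is json.JSONDecodeError, which is a subclass of ValueError, so catching ValueError should work on both Python 2.7 and 3.

```python
import json

# Hypothetical miniature of the file: two top-level arrays with nothing between them.
sample = '[{"a": 1}][{"b": 2}]'


def try_parse(text):
    """Return (parsed_value, None) on success or (None, error_message) on failure."""
    try:
        return json.loads(text), None
    except ValueError as exc:  # json.JSONDecodeError subclasses ValueError on Python 3
        return None, str(exc)


value, error = try_parse(sample)
print(error)  # the message starts with "Extra data", as in the traceback above
```

The parser reads the first array successfully and then complains about everything that follows it, which is why the reported position is the point where the first array ends.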

Any ideas on how to fix this would be appreciated, thanks!

magni- · Accepted Answer

Without a link to your JSON file, I'm going to have to make some assumptions:

Top-level JSON arrays are not each on their own line (since the first parsing error is on line 3889), so we can't simply parse the file line by line.
This is the only type of invalid JSON present in the file.

To fix this:

import json

# 1. Replace every `][` with `]<SPLIT>[`.
# (`<SPLIT>` must be a marker that does not occur anywhere in the file to begin with.)

with open('ig001.json') as data_file:
    raw_data = data_file.read()  # we're going to need the entire file in memory
tweaked_data = raw_data.replace('][', ']<SPLIT>[')

# 2. Split the string into a list of strings, using the chosen split marker.

split_data = tweaked_data.split('<SPLIT>')

# 3. Parse each string individually.

parsed_data = [json.loads(bit_of_data) for bit_of_data in split_data]

(pardon the horrible variable names)
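
If picking a split marker that is guaranteed not to appear in the file is a concern, a different technique from the one above is to let the standard library decoder walk the string itself with json.JSONDecoder.raw_decode, which parses one value and reports where it stopped. The following is a rough sketch under the same assumptions (the file is a series of valid top-level JSON values back to back, and it fits in memory); the function name iter_json_values and the sample string are hypothetical.

```python
import json


def iter_json_values(text):
    """Yield each top-level JSON value found back to back in `text`."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so skip it by hand.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # Parse one value starting at `pos`; the second item is the index just past it.
        value, pos = decoder.raw_decode(text, pos)
        yield value


# Hypothetical stand-in for the contents of the file.
sample = '[{"a": 1}]\n[{"b": 2}, {"c": 3}]'
parsed_data = list(iter_json_values(sample))
```

With a real file, `text` would be the result of data_file.read(). Since no text replacement takes place, a `][` that happens to occur inside a string value is left alone.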

Tags:

python

json

john2131