I have tens of thousands of JSON objects in a JSON file in the following format:
{ "a": 1,
"b" : 2,
"c" : {
"d":3
}
}{ "e" : 4,
"f" : 5,
"g" : {
"h":6
}
}
How can I load these as JSON objects?
Here are two methods I've tried, with the corresponding errors:
Method 1:
>>> with open('test1.json') as jsonfile:
...     for line in jsonfile:
...         data = json.loads(line)
...
Error:
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "/usr/lib/python3.5/json/__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.5/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.5/json/decoder.py", line 355, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 2 column 1 (char 10)
Method 2:
>>> with open('test1.json') as jsonfile:
...     data = json.load(jsonfile)
...
Error:
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/lib/python3.5/json/__init__.py", line 268, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib/python3.5/json/__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.5/json/decoder.py", line 342, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 7 column 1 (char 46)
>>>
I've read the related questions but none of them helped.
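The content of the file you described is not a valid JSON document, which is why both approaches fail. Method 1 fails because each line holds only a fragment of an object, and Method 2 fails because json.load expects exactly one JSON value and finds a second object after the first.
To transform the file into something you can load with json.load(fd), you have to:
add a [ at the beginning of the file,
add a , between each object,
add a ] at the very end of the file.
Then you can use Method 2. For instance: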
[ { "a": 1,
"b" : 2,
"c" : {
"d":3
}
}, { "e" : 4,
"f" : 5,
"g" : {
"h":6
}
}
]
is a valid JSON array.
If the file format is exactly as you've described, you could do:
import json

with open(filename, 'r') as infile:
    data = infile.read()

# Insert a comma between adjacent objects, then wrap everything in a JSON array.
new_data = data.replace('}{', '},{')
json_data = json.loads(f'[{new_data}]')
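The plain string replacement above only matches objects that touch directly ("}{"). If the objects might be separated by spaces or newlines, a regular expression can do the same job. This is a minimal sketch, not a definitive solution: the helper name load_concatenated and the sample string are made up for illustration, and like the original replacement it would also rewrite a "}{" that happens to appear inside a string value.

```python
import json
import re


def load_concatenated(text):
    """Parse back-to-back JSON objects by turning them into one JSON array.

    Hypothetical helper for illustration only.
    """
    # Put a comma wherever one object closes and the next one opens,
    # allowing optional whitespace between them.
    # Caveat: this also rewrites a "}{" that occurs inside a string value.
    joined = re.sub(r'\}\s*\{', '},{', text)
    return json.loads('[' + joined + ']')


sample = '{ "a": 1, "c": { "d": 3 } }\n{ "e": 4, "g": { "h": 6 } }'
objects = load_concatenated(sample)
print(objects)
```

If the data can contain braces inside strings, decoding object by object is the safer route.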
I believe that the best approach, if you don't want to change the source file, is to use json.JSONDecoder.raw_decode(). It lets you iterate through each valid JSON object in the file:
from json import JSONDecoder, JSONDecodeError

decoder = JSONDecoder()
content = '{ "a": 1, "b": 2, "c": { "d":3 }}{ "e": 4, "f": 5, "g": {"h":6 } }'
pos = 0
while True:
    try:
        # raw_decode returns the parsed object and the index where parsing stopped
        o, pos = decoder.raw_decode(content, pos)
        print(o)
    except JSONDecodeError:
        # Nothing more can be decoded from pos (end of input or malformed data)
        break
This prints your two JSON objects.
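One thing to watch for with raw_decode(): it does not skip leading whitespace, so if the objects in the file are separated by newlines or spaces, a loop that breaks on JSONDecodeError will stop quietly after the first object. A possible variant, sketched here with a hypothetical generator name iter_json_objects, skips whitespace explicitly and lets real parse errors propagate instead of swallowing them:

```python
from json import JSONDecoder


def iter_json_objects(text):
    """Yield each JSON value found back to back in text.

    Hypothetical helper for illustration only.
    """
    decoder = JSONDecoder()
    pos = 0
    end = len(text)
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            return
        # Malformed input raises json.JSONDecodeError to the caller.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


content = '{ "a": 1, "b": 2, "c": { "d":3 }}\n{ "e": 4, "f": 5, "g": {"h":6 } }\n'
for obj in iter_json_objects(content):
    print(obj)
```

For a real file you would read its contents into a string first, for example with open('test1.json') and read(), and pass that string to the generator.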