I'm trying to parse a large (~100 MB) JSON file using the ijson package, which lets me interact with the file in a memory-efficient way. However, after writing some code like this,
import ijson

with open(filename, 'r') as f:
    parser = ijson.parse(f)
    for prefix, event, value in parser:
        if prefix == "name":
            print(value)
I found that the code parses only the first line of the file and ignores all the remaining lines!
Here is what a portion of my JSON file looks like:
{"name":"accelerator_pedal_position","value":0,"timestamp":1364323939.012000}
{"name":"engine_speed","value":772,"timestamp":1364323939.027000}
{"name":"vehicle_speed","value":0,"timestamp":1364323939.029000}
{"name":"accelerator_pedal_position","value":0,"timestamp":1364323939.035000}
I suspect that ijson parses only a single JSON object per file.
Can someone please suggest how to work around this?
To load big JSON files in a memory-efficient and fast way with Python, we can use the ijson library. We call ijson.parse on the file opened by open. The parser yields (prefix, event, value) tuples: prefix is the key path of the current entry, event describes the type of the JSON value, and value holds the parsed value for that prefix. Note, however, that by default ijson.parse expects the file to contain a single JSON document, which is why it stops after the first line here.
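Since your file is actually a sequence of independent top-level JSON values (one per line), you can tell ijson to keep going past the first one. Recent versions of ijson accept a multiple_values=True flag for exactly this case; here is a minimal sketch, assuming ijson 3.x and the same filename variable as in the question:

import ijson

# multiple_values=True lets the parser continue past the first
# top-level JSON value instead of stopping after one document
# (assumes ijson 3.x; filename is the same variable as above).
with open(filename, 'rb') as f:  # binary mode is what ijson's docs recommend
    for prefix, event, value in ijson.parse(f, multiple_values=True):
        if prefix == "name":
            print(value)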
If the data doesn't update too frequently, you can even cache it on the frontend, which at least keeps the user from fetching it repeatedly. Alternatively, you can read the JSON via a stream on the server, stream the data to the client, and use something like JSONStream to parse it on the client.
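As a rough sketch of the server side of that idea (this is an assumption, not part of the original answer: it uses Flask, a hypothetical /readings route, and a hypothetical data.json file in the same newline-delimited format):

from flask import Flask, Response

app = Flask(__name__)

@app.route('/readings')  # hypothetical endpoint
def readings():
    def generate():
        # Stream the newline-delimited JSON file to the client
        # line by line, without loading it all into memory.
        with open('data.json', 'r') as f:  # hypothetical filename
            for line in f:
                yield line
    return Response(generate(), mimetype='application/x-ndjson')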
Since the provided chunk is really a set of lines, each containing an independent JSON object (a format often called JSON Lines or NDJSON), it should be parsed accordingly:
# each JSON object is small, so there's no need for iterative parsing
import json

with open(filename, 'r') as f:
    for line in f:
        data = json.loads(line)
        # data['name'], data['value'], data['timestamp'] now
        # contain the corresponding values