I have a 1.7 GB JSON file. When I try to open it with json.load(),
it gives a memory error. How can I read the JSON file in Python?
My JSON file is a big array of objects containing specific keys.
Edit: Well, if it is just one big array of objects and the structure of the objects is known beforehand, then there is no need for special tools; we can read the file line by line. Each line will contain just one element of the array. I noticed that this is how my JSON file happened to be stored, so for me it worked as simply:
>>> for line in open('file.json', 'r'):  # iterate over the file itself, not over .readline()
...     do_something_with(line)
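A slightly fuller sketch of the line-by-line idea, assuming the file really is laid out with `[` and `]` on their own lines and exactly one array element per line (possibly followed by a comma). The function name `iter_array_lines` is made up for illustration, and this will not work on a file where elements span several lines.

```python
import json

def iter_array_lines(path):
    """Yield one decoded element per line of a one-element-per-line JSON array."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            # Drop surrounding whitespace and the comma that separates elements.
            text = line.strip().rstrip(",")
            # Skip blank lines and the lines holding only the array brackets.
            if text in ("", "[", "]"):
                continue
            yield json.loads(text)
```

Because it is a generator, only one line is held in memory at a time, e.g. `for obj in iter_array_lines('file.json'): ...`.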
You want an incremental JSON parser like yajl and one of its Python bindings. An incremental parser reads as little as possible from the input and invokes a callback whenever something meaningful is decoded. For example, to pull only the numbers from a big JSON file:
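from yajl import YajlContentHandler, YajlParser

list_of_numbers = []

class ContentHandler(YajlContentHandler):
    # Invoked by the parser each time it decodes a number
    def yajl_number(self, ctx, val):
        list_of_numbers.append(float(val))

parser = YajlParser(ContentHandler())
parser.parse(some_file)  # some_file is an open file object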
See http://pykler.github.com/yajl-py/ for more info.
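If installing yajl is not an option, the incremental idea can be roughly approximated with the standard library alone. The sketch below is not yajl's API; it swaps in `json.JSONDecoder.raw_decode` and reads the file in chunks, yielding each element of a top-level array as soon as it is complete. The name `iter_json_array` and the `chunk_size` default are assumptions chosen for illustration, and it is lenient about stray commas.

```python
import json

def iter_json_array(fileobj, chunk_size=65536):
    """Yield the elements of a top-level JSON array one at a time."""
    decoder = json.JSONDecoder()
    buf = ""
    eof = False

    def fill():
        # Append the next chunk to the buffer; note when the input is exhausted.
        nonlocal buf, eof
        chunk = fileobj.read(chunk_size)
        if chunk:
            buf += chunk
        else:
            eof = True

    # Read until the opening bracket of the array is available.
    while not buf.lstrip() and not eof:
        fill()
    buf = buf.lstrip()
    if not buf.startswith("["):
        raise ValueError("expected a top-level JSON array")
    buf = buf[1:]

    while True:
        buf = buf.lstrip()
        if buf.startswith(","):
            buf = buf[1:].lstrip()
        if buf.startswith("]"):
            return
        if buf:
            try:
                value, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                pass  # element is probably incomplete; read more below
            else:
                rest = buf[end:].lstrip()
                # Accept the value only once its separator is visible, so a
                # number cut off at a chunk boundary is never yielded early.
                if rest[:1] in (",", "]"):
                    yield value
                    buf = rest
                    continue
        if eof:
            raise ValueError("truncated or malformed JSON array")
        fill()
```

Usage would look like `with open('file.json') as f: for obj in iter_json_array(f): ...`. Memory use is bounded by the largest single element plus one chunk, though a dedicated incremental parser should be noticeably faster on a 1.7 GB file.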
I have found another Python wrapper around the yajl library: ijson.
It works better for me than yajl-py due to the following reasons: