I would like to know which one of json.dump()
or json.dumps()
are the most efficient when it comes to encoding a large array to json format.
Can you please show me an example of using json.dump()
?
Actually I am making a Python CGI that gets large amount of data from a MySQL database using the ORM SQlAlchemy, and after some user triggered processing, I store the final output in an Array that I finally convert to Json.
But when converting to JSON with :
print json.dumps({'success': True, 'data': data}) #data is my array
I get the following error:
Traceback (most recent call last):
File "C:/script/cgi/translate_parameters.py", line 617, in <module>
f.write(json.dumps(mytab,default=dthandler,indent=4))
File "C:\Python27\lib\json\__init__.py", line 250, in dumps
sort_keys=sort_keys, **kw).encode(obj)
File "C:\Python27\lib\json\encoder.py", line 209, in encode
chunks = list(chunks)
MemoryError
So, my guess is using json.dump()
to convert data by chunks. Any ideas on how to do this?
Or other ideas besides using json.dump()
?
You can simply replace
f.write(json.dumps(mytab,default=dthandler,indent=4))
by
json.dump(mytab, f, default=dthandler, indent=4)
This should "stream" the data into the file.
The JSON
module will allocate the entire JSON string in memory before writing, which is why MemoryError
occurs.
To get around this problem, use JSON.Encoder().iterencode()
:
with open(filepath, 'w') as f:
for chunk in json.JSONEncoder().iterencode(object_to_encode):
f.write(chunk)
However note that this will generally take quite a while, since it is writing in many small chunks and not everything at once.
Special case:
I had a Python object which is a list of dicts. Like such:
[
{ "prop": 1, "attr": 2 },
{ "prop": 3, "attr": 4 }
# ...
]
I could JSON.dumps()
individual objects, but the dumping whole list generates a MemoryError
To speed up writing, I opened the file and wrote the JSON delimiter manually:
with open(filepath, 'w') as f:
f.write('[')
for obj in list_of_dicts[:-1]:
json.dump(obj, f)
f.write(',')
json.dump(list_of_dicts[-1], f)
f.write(']')
You can probably get away with something like that if you know your JSON object structure beforehand. For a general use, just use JSON.Encoder().iterencode()
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With