I have a project that uses JSON as a cross-language serialization format to pass data around. Recently the data has grown fairly large (a list of about 10k objects), and it takes the Python standard json library around 20 seconds to serialize it.
I am working on reducing that time. While switching to another JSON serializer (cjson, simplejson, ujson) speeds things up quite a bit, I started wondering about the time complexity of JSON serialization. If the relationship is not linear (say, if it is n^2), I could easily chop the data into chunks and reduce the time significantly, roughly as sketched below.
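To illustrate what I mean by chunking, here is a rough sketch (the dumps_chunked helper and the chunk size of 1000 are made up for illustration, not an existing API):

import json

def dumps_chunked(items, chunk_size=1000):
    # Serialize the list slice by slice; each part is a valid JSON array
    # covering one chunk. This only pays off if dumps() is super-linear.
    parts = []
    for i in xrange(0, len(items), chunk_size):
        parts.append(json.dumps(items[i:i + chunk_size]))
    return parts

chunks = dumps_chunked([{"id": i} for i in xrange(10000)])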
From what I can guess, the complexity really depends on the input data, but is there a worst-case/average-case estimate available? A link to a reference would be highly appreciated too.
Thanks.
I've benchmarked the time complexity with this code:
import json
import random
import time

# Time json.dumps() on lists of random floats of increasing length.
Ns = 10, 100, 1000, 10000, 100000, 200000, 300000, 600000, 1000000
for N in Ns:
    l = [random.random() for i in xrange(N)]
    t0 = time.time()
    s = json.dumps(l)
    t1 = time.time()
    dt = t1 - t0
    print "%s %s" % (N, dt)
On my machine, the outcome is:
10 7.20024108887e-05
100 0.000385999679565
1000 0.00362801551819
10000 0.036504983902
100000 0.366562128067
200000 0.73614192009
300000 1.09785795212
600000 2.20272803307
1000000 3.6590487957
The first column is the list length; the second is the serialization time in seconds. Plotting the data (with xmgrace, for example) reveals an essentially ideal linear relationship.
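If you prefer not to plot, a quick sanity check is to divide each timing by the list length: for O(n) behaviour the per-element time should stay roughly constant, and indeed it hovers around 3.7e-6 s per element for all but the smallest list. The snippet below just reuses the measurements printed above:

# Per-element serialization time; roughly constant values indicate O(n).
results = [
    (10, 7.20024108887e-05),
    (100, 0.000385999679565),
    (1000, 0.00362801551819),
    (10000, 0.036504983902),
    (100000, 0.366562128067),
    (200000, 0.73614192009),
    (300000, 1.09785795212),
    (600000, 2.20272803307),
    (1000000, 3.6590487957),
]
for N, dt in results:
    print "%8d  %.2e s/element" % (N, dt / N)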