How can I concatenate a list of JSON files into one huge JSON array? I have 5,000 files and 550,000 list items.
My first try was to use jq, but it looks like jq -s is not optimized for a large input.
jq -s -r '[.[][]]' *.js
This command works, but it takes far too long to complete, and I would really like to solve this in Python.
Here is my current code:
import json

def concatFiles(outName, inFileNames):
    def listGenerator():
        for inName in inFileNames:
            with open(inName, 'r') as f:
                for item in json.load(f):
                    yield item
    with open(outName, 'w') as f:
        json.dump(listGenerator(), f)
I'm getting:
TypeError: <generator object listGenerator at 0x7f94dc2eb3c0> is not JSON serializable
Any attempt to load all files into RAM will trigger the Linux OOM killer. Do you have any ideas?
As of simplejson 3.8.0, you can use the iterable_as_array option to make any iterable serializable into an array.
# Since simplejson is backwards compatible, you should feel free to import
# it as `json`
import simplejson as json
json.dumps((i*i for i in range(10)), iterable_as_array=True)
The result is [0, 1, 4, 9, 16, 25, 36, 49, 64, 81].
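Applied to your concatFiles function, a minimal sketch could look like the following (assuming each input file contains a JSON list, as in the question, and passing iterable_as_array through json.dump so the generator is written out as it is consumed):

import simplejson as json

def concatFiles(outName, inFileNames):
    # Yield items one by one from each input file; only one file's
    # contents is held in memory at a time.
    def listGenerator():
        for inName in inFileNames:
            with open(inName, 'r') as f:
                for item in json.load(f):
                    yield item
    with open(outName, 'w') as f:
        # iterable_as_array lets simplejson serialize the generator
        # directly as a single JSON array.
        json.dump(listGenerator(), f, iterable_as_array=True)

Each input file is still fully parsed by json.load, but the combined 550,000-item array is never held in memory at once.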