I have access to a set of files (around 80-800mb each). Unfortunately, there's only one line in every file. The line contains exactly one JSON object (a list of lists). What's the best way to load and parse it into smaller JSON objects?
Since version 0.21.0, pandas supports chunksize as a parameter of read_json, so you can load and process one chunk at a time. Note that chunksize only works together with lines=True, which expects newline-delimited JSON (one object per line):
import pandas as pd

chunks = pd.read_json(file, lines=True, chunksize=100)
for chunk in chunks:
    print(chunk)  # each chunk is a DataFrame
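For a self-contained illustration of the chunked read, here is a small sketch using in-memory JSON Lines data (the sample records are illustrative, not from the question):

```python
import pandas as pd
from io import StringIO

# A tiny JSON Lines sample: one JSON object per line,
# which is the format chunksize/lines=True expects.
jsonl = '{"a": 1}\n{"a": 2}\n{"a": 3}\n'

# read_json returns an iterator of DataFrames when chunksize is set.
reader = pd.read_json(StringIO(jsonl), lines=True, chunksize=2)

total_rows = 0
for chunk in reader:
    total_rows += len(chunk)  # each chunk holds at most 2 rows here

print(total_rows)  # 3
```

Each chunk is an ordinary DataFrame, so any per-chunk processing (filtering, writing out smaller files, etc.) can go inside the loop without ever holding the whole file in memory.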
There is already a similar post here. Here is the solution they proposed:
import json

with open('file.json') as infile:
    o = json.load(infile)  # note: this loads the entire file into memory

chunk_size = 1000
for i in range(0, len(o), chunk_size):  # range, not Python 2's xrange
    with open('file_' + str(i // chunk_size) + '.json', 'w') as outfile:
        json.dump(o[i:i + chunk_size], outfile)
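To see the split-and-dump approach end to end, here is a self-contained sketch (the sample data, chunk size, and file naming are illustrative, mirroring the snippet above):

```python
import json
import os
import tempfile

# Sample data standing in for the parsed list of lists.
data = [[i, i * 2] for i in range(10)]

chunk_size = 4
out_dir = tempfile.mkdtemp()
paths = []

# Write each slice of the list to its own smaller JSON file.
for i in range(0, len(data), chunk_size):
    path = os.path.join(out_dir, 'file_' + str(i // chunk_size) + '.json')
    with open(path, 'w') as outfile:
        json.dump(data[i:i + chunk_size], outfile)
    paths.append(path)

# Read one chunk back to confirm the round trip.
with open(paths[0]) as infile:
    first_chunk = json.load(infile)

print(len(paths))      # 3 files for 10 items in chunks of 4
print(first_chunk[0])  # [0, 0]
```

Keep in mind this approach still parses the whole input with json.load first, so it needs enough RAM for the full object; the pandas chunked read above avoids that, but only for newline-delimited JSON.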