I recently asked a question about how to save large Python objects to file. I had previously run into problems converting massive Python dictionaries to strings and writing them to file via write(). Now I am using pickle. Although it works, the files are incredibly large (> 5 GB). I have little experience with files this large. I wanted to know if it would be faster, or even possible, to zip this pickle file prior to storing it on disk.
Pure Python code is extremely slow when it comes to data serialization. If you tried to write an equivalent of pickle in pure Python, you'd see that it is super slow. Fortunately, the built-in modules that do this are quite good.
Apart from cPickle, there is the marshal module, which is a lot faster. But it needs a real file handle (not a file-like object). You can import marshal as Pickle and see the difference.
I don't think you can write a custom serializer that is much faster than this...
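As a rough, hypothetical comparison (the file names, test data, and timing code here are my own illustration, not from the answer), you could time both modules side by side like this:

import marshal
import pickle  # cPickle on Python 2
import time

data = {i: {'x': i, 'y': i * 2} for i in range(1000000)}

# pickle works with any file-like object
start = time.time()
with open('data.pickle', 'wb') as f:
    pickle.dump(data, f, protocol=pickle.HIGHEST_PROTOCOL)
print('pickle:', time.time() - start)

# marshal expects a real file object opened in binary mode
start = time.time()
with open('data.marshal', 'wb') as f:
    marshal.dump(data, f)
print('marshal:', time.time() - start)

Keep in mind that marshal is meant for Python's internal use (.pyc files), so its format is not guaranteed to be stable across Python versions.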
Here's an actual (not so old) serious benchmark of Python serializers
You can compress the data with bzip2:
from __future__ import with_statement  # Only needed on Python 2.5
import bz2, json, contextlib

hugeData = {'key': {'x': 1, 'y': 2}}
with contextlib.closing(bz2.BZ2File('data.json.bz2', 'wb')) as f:
    json.dump(hugeData, f)
Load it like this:
from __future__ import with_statement  # Only needed on Python 2.5
import bz2, json, contextlib

with contextlib.closing(bz2.BZ2File('data.json.bz2', 'rb')) as f:
    hugeData = json.load(f)
You can also compress the data using zlib or gzip with pretty much the same interface. However, the compression ratios of zlib and gzip will generally be lower than what you get with bzip2 (or lzma).
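For instance, here is a minimal sketch of the same pattern with gzip, using pickle instead of json since that is what the question uses (the file name is just an example):

import gzip
import pickle

hugeData = {'key': {'x': 1, 'y': 2}}

# gzip.open returns a file-like object, so pickle can write to it directly
with gzip.open('data.pickle.gz', 'wb') as f:
    pickle.dump(hugeData, f, protocol=pickle.HIGHEST_PROTOCOL)

with gzip.open('data.pickle.gz', 'rb') as f:
    hugeData = pickle.load(f)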