I used cPickle and protocol version 2 to dump some computation results. The code looks like this:
> f = open('foo.pck', 'wb')
> cPickle.dump(var, f, protocol=2)
> f.close()
The variable var is a tuple of length two: var[0] is a list and var[1] is a numpy.ndarray.
The code above successfully generated a large file (~1.7 GB).
However, when I tried to load the variable from foo.pck, I got the following error.
ValueError Traceback (most recent call last)
/home/user_account/tmp/<ipython-input-3-fd3ecce18dcd> in <module>()
----> 1 v = cPickle.load(f)
ValueError: buffer size does not match array size
The loading code looks like this:
> f = open('foo.pck', 'rb')
> v = cPickle.load(f)
I also tried to use pickle (instead of cPickle) to load the variable, but got a similar error message:
ValueError Traceback (most recent call last)
/home/user_account/tmp/<ipython-input-3-aa6586c8e4bf> in <module>()
----> 1 v = pickle.load(f)
/usr/lib64/python2.6/pickle.pyc in load(file)
1368
1369 def load(file):
-> 1370 return Unpickler(file).load()
1371
1372 def loads(str):
/usr/lib64/python2.6/pickle.pyc in load(self)
856 while 1:
857 key = read(1)
--> 858 dispatch[key](self)
859 except _Stop, stopinst:
860 return stopinst.value
/usr/lib64/python2.6/pickle.pyc in load_build(self)
1215 setstate = getattr(inst, "__setstate__", None)
1216 if setstate:
-> 1217 setstate(state)
1218 return
1219 slotstate = None
ValueError: buffer size does not match array size
I tried the same code with much smaller data and it worked fine, so my best guess is that I hit a loading size limitation of pickle (or cPickle). It is strange, however, that the dump succeeds for a large variable while the load fails.
If this is indeed a loading size limitation problem, how should I bypass it? If not, what can be the possible cause of the problem?
Any suggestion is appreciated. Thanks!
How about saving and loading the numpy array with numpy.save() and np.load()?
You can save the pickled list and the numpy array to the same file:
import numpy as np
import cPickle
data = np.random.rand(50000000)
f = open('foo.pck', 'wb')
cPickle.dump([1,2,3], f, protocol=2)
np.save(f, data)
f.close()
to read the data:
import cPickle
import numpy as np
f = open('foo.pck', 'rb')
v = cPickle.load(f)
data = np.load(f)
print data.shape, data
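If moving to Python 3 is an option, this class of failure generally goes away: pickle protocol 4 (available since Python 3.4, per PEP 3154) uses 64-bit length framing and explicitly supports objects larger than 4 GB, so the tuple can be pickled and unpickled directly. A minimal sketch of that round-trip, using a small stand-in for the real data (the file path and array size here are illustrative):

```python
import os
import pickle
import tempfile

import numpy as np

# Stand-in for the real (list, ndarray) tuple; the actual array was ~1.7 GB.
var = ([1, 2, 3], np.arange(10, dtype=np.float64))

path = os.path.join(tempfile.gettempdir(), 'foo.pck')

# Protocol 4+ uses 64-bit length fields, so very large objects pickle cleanly.
with open(path, 'wb') as f:
    pickle.dump(var, f, protocol=pickle.HIGHEST_PROTOCOL)

with open(path, 'rb') as f:
    v = pickle.load(f)

print(v[0], v[1].shape)  # [1, 2, 3] (10,)
```

numpy.savez() is another option for bundling several arrays into one file, though it stores everything as arrays rather than preserving the original list.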