Python pickle calls cPickle?

Tags:

python

python-3.x

intellij-idea

pickle

I am new to Python. I am adapting someone else's code from Python 2.X to 3.5. The code loads a file via cPickle. I changed all "cPickle" occurrences to "pickle", as I understand pickle superseded cPickle in Python 3. I get this execution error:

NameError: name 'cPickle' is not defined

Pertinent code:

import pickle
import gzip
...
def load_data():
    f = gzip.open('../data/mnist.pkl.gz', 'rb')
    training_data, validation_data, test_data = pickle.load(f, fix_imports=True)
    f.close()
    return (training_data, validation_data, test_data)

The error occurs in the pickle.load line when load_data() is called by another function. However, a) neither cPickle nor cpickle appears any longer in any source file in the project (searched globally) and b) the error does not occur if I run the lines within load_data() individually in the Python shell (although I do get a different data format error). Is pickle calling cPickle, and if so how do I stop it?

Shell: Python 3.5.0 |Anaconda 2.4.0 (x86_64)| (default, Oct 20 2015, 14:39:26) [GCC 4.2.1 (Apple Inc. build 5577)] on darwin

IDE: IntelliJ 15.0.1, Python 3.5.0, anaconda

Unclear how to proceed. Any help appreciated. Thanks.

asked Nov 17 '15 00:11

Ron Cohen

3 Answers

Actually, if you have pickled objects from python2.x, they can generally be read by python3.x. Also, if you have pickled objects from python3.x, they can generally be read by python2.x, but only if they were dumped with a protocol set to 2 or less.

Python 2.7.10 (default, Sep  2 2015, 17:36:25) 
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 
>>> x = [1,2,3,4,5]
>>> import math
>>> y = math.sin
>>>     
>>> import pickle 
>>> f = open('foo.pik', 'wb')
>>> pickle.dump(x, f)
>>> pickle.dump(y, f)
>>> f.close()
>>> 
dude@hilbert>$ python3.5
Python 3.5.0 (default, Sep 15 2015, 23:57:10) 
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pickle
>>> with open('foo.pik', 'rb') as f:
...   x = pickle.load(f)
...   y = pickle.load(f)
... 
>>> x
[1, 2, 3, 4, 5]
>>> y
<built-in function sin>
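
A minimal sketch of the two directions described above, run entirely under Python 3. The payload bytes are an assumption: they are hand-written to mimic what Python 2 emits at protocol 0 for the 8-bit str '\xe9', and are not taken from the question's file. The `encoding` argument of `pickle.loads` controls how such 8-bit strings are decoded, which may be relevant when a Python 2 pickle loads with a decoding error.

```python
import pickle

# Hand-written protocol-0 pickle, mimicking Python 2 output for the
# 8-bit str '\xe9' (illustrative bytes, an assumption for this sketch).
py2_payload = b"S'\\xe9'\np0\n."

# The default encoding is ASCII, which rejects the non-ASCII byte.
try:
    pickle.loads(py2_payload)
    default_ok = True
except UnicodeDecodeError:
    default_ok = False

# Decode Python 2 8-bit strings as Latin-1 text, or keep them as bytes.
as_text = pickle.loads(py2_payload, encoding="latin1")
as_bytes = pickle.loads(py2_payload, encoding="bytes")

# Writing from Python 3 for a Python 2 reader: cap the protocol at 2.
for_py2 = pickle.dumps([1, 2, 3, 4, 5], protocol=2)
```

Whether `latin1` or `bytes` is the right choice depends on what the original program stored, so treat both as options to try rather than a guaranteed fix.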

Also, if you are looking for cPickle, it's now _pickle, not pickle.

>>> import _pickle
>>> _pickle
<module '_pickle' from '/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/lib-dynload/_pickle.cpython-35m-darwin.so'>
>>>

You also asked how to stop pickle from using the built-in (C) version. You can do this by using _dump and _load, or the _Pickler class if you like to work with the class objects. Confused? The old cPickle is now _pickle; however, dump, load, dumps, and loads all point to _pickle, while _dump, _load, _dumps, and _loads point to the pure-Python version. For instance:

>>> import pickle
>>> # _dumps is a python function
>>> pickle._dumps
<function _dumps at 0x109c836a8>
>>> # dumps is a built-in (C)
>>> pickle.dumps
<built-in function dumps>
>>> # the Pickler points to _pickle (C)
>>> pickle.Pickler 
<class '_pickle.Pickler'>
>>> # the _Pickler points to pickle (pure python)
>>> pickle._Pickler
<class 'pickle._Pickler'>
>>>

So if you don't want to use the built-in version, then you can use pickle._loads and the like.
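
As a rough sketch of that idea, the snippet below loads the same bytes once through the default (C-accelerated) path and once through the pure-Python class. Note that `pickle._Unpickler` is an underscore-prefixed, undocumented name, so relying on it is an assumption about the implementation rather than a supported API; the sample data is made up for illustration.

```python
import io
import pickle

# Sample data (hypothetical), dumped at protocol 2 so Python 2 could read it too.
payload = pickle.dumps({"a": [1, 2, 3]}, protocol=2)

# Default path: uses the C accelerator (_pickle) when it is available.
fast = pickle.loads(payload)

# Pure-Python path: instantiate the private _Unpickler class directly.
slow = pickle._Unpickler(io.BytesIO(payload)).load()

# Both implementations read the same format and give the same result.
same = (fast == slow)
```

The pure-Python path is slower, but it can be stepped through in a debugger, which can help when tracking down which global name a pickle fails on.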

answered Nov 06 '22 04:11

Mike McKerns

It looks like the pickled data that you're trying to load was generated by a version of the program that was running on Python 2.7. The data is what contains the references to cPickle.
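
One way to check which names a pickle refers to, without loading it, is to walk its opcodes with the standard `pickletools` module. This is a sketch only: the helper name `referenced_globals` is made up here, and it reports only the classic `GLOBAL` opcode used by protocols 0 to 3 (which covers anything Python 2 could have written).

```python
import math
import pickle
import pickletools


def referenced_globals(data):
    """Return the 'module name' strings named by GLOBAL opcodes in a pickle."""
    found = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            found.append(arg)  # pickletools reports this as "module name"
    return found


# Sample pickle (hypothetical): a reference to math.sin at protocol 2.
sample = pickle.dumps(math.sin, protocol=2)
print(referenced_globals(sample))  # prints ['math sin']
```

Because `genops` only parses the byte stream, nothing in the file is imported or executed during the inspection.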

The problem is that Pickle, as a serialization format, assumes that your standard library (and to a lesser extent your code) won't change layout between serialization and deserialization. Which it did -- a lot -- between Python 2 and 3. And when that happens, Pickle has no path for migration.

Do you have access to the program that generated mnist.pkl.gz? If so, port it to Python 3 and re-run it to regenerate a Python 3-compatible version of the file.

If not, you'll have to write a Python 2 program that loads that file and exports it to a format that can be loaded from Python 3 (depending on the shape of your data, JSON and CSV are popular choices), then write a Python 3 program that loads that format and dumps it as a Python 3 pickle. You can then load that pickle file from your original program.

Of course, what you should really do is stop at the point where you have the ability to load the exported format from Python 3 -- and use the aforementioned format as your actual, long-term storage format.
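
A minimal sketch of the Python 3 side of such a neutral-format round trip, assuming the data reduces to nested lists of numbers. The function names, file name, and sample values are all hypothetical; a Python 2 exporter would make the equivalent `json.dump` call.

```python
import json
import os
import tempfile


def export_json(data, path):
    """Write plain lists, dicts, numbers, and strings to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(data, fh)


def import_json(path):
    """Read the data back; note that JSON turns tuples into lists."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


# Hypothetical stand-in for the real data set.
training_data = [[0.0, 0.5, 1.0], [5, 0, 4]]

path = os.path.join(tempfile.mkdtemp(), "export.json")
export_json(training_data, path)
restored = import_json(path)
```

JSON is verbose for large numeric arrays, so for array-heavy data a binary array format may be a better fit; the point of the sketch is only that the stored file no longer depends on any Python module layout.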

Using Pickle for anything other than short-term serialization between trusted programs (loading Pickle is equivalent to running arbitrary code in your Python VM) is something you should actively avoid, among other things because of the exact case you find yourself in.
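
If a pickle from a less trusted source has to be loaded anyway, the standard library documents a way to restrict which globals an unpickler may resolve by overriding `find_class`. The sketch below follows that pattern; the allow-list contents are an assumption chosen for illustration, and this reduces the risk rather than making pickle safe.

```python
import builtins
import io
import math
import pickle

# Assumption: only these builtins are considered acceptable in this sketch.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Resolve only allow-listed builtins; refuse every other global.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name)
        )


def restricted_loads(data):
    """Like pickle.loads(), but with the restricted global lookup."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


allowed = restricted_loads(pickle.dumps([1, 2, range(15)]))

try:
    restricted_loads(pickle.dumps(math.sin))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```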

answered Nov 06 '22 03:11

Max Noel

In Anaconda Python 3.5, one can access cPickle as

import _pickle as cPickle

Credit to Mike McKerns.
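
For code that must run on both Python 2 and Python 3, a commonly used alternative is a guarded import, sketched below. Under Python 3 the plain `pickle` module already uses the C accelerator when it is available, so importing `_pickle` directly is usually not needed.

```python
try:
    import cPickle as pickle  # Python 2: C implementation
except ImportError:
    import pickle  # Python 3: uses the C accelerator when available

# The rest of the code uses the name `pickle` on either version.
round_trip = pickle.loads(pickle.dumps([1, 2, 3], 2))
```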

answered Nov 06 '22 03:11

Shravan Kumar
