
Python 3 pickle load from Python 2

I have a pickle file that was created (I don't know exactly how) in Python 2. It is meant to be loaded by the following Python 2 lines, which (unsurprisingly) do not work in Python 3:

with open('filename','r') as f:
    foo, bar = pickle.load(f)

Result:

'ascii' codec can't decode byte 0xc2 in position 1219: ordinal not in range(128)

Manual inspection of the file indicates it is utf-8 encoded, therefore:

with open('filename','r', encoding='utf-8') as f:
    foo, bar = pickle.load(f)

Result:

TypeError: a bytes-like object is required, not 'str'

With binary mode and an encoding argument:

with open('filename','rb', encoding='utf-8') as f:
    foo, bar = pickle.load(f)

Result:

ValueError: binary mode doesn't take an encoding argument

With binary mode and no encoding argument:

with open('filename','rb') as f:
    foo, bar = pickle.load(f)

Result:

UnpicklingError: invalid load key, ' '.

Is this pickle file just broken? If not, how can I pry this thing open in Python 3? (I have browsed the extensive collection of related questions and have not found anything that works yet.)

Finally, note that the original

import cPickle as pickle

has been replaced with

import _pickle as pickle

Novak asked May 11 '18 00:05


1 Answer

Loading Python 2 pickles in Python 3 (version 3.7.2 in this example) can be helped by the fix_imports parameter of the pickle.load function. That parameter already defaults to True, which is why in my case loading also worked without setting it explicitly.
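To illustrate what fix_imports does, here is a minimal sketch. The byte string is hand-written for the example (an assumption, not taken from any real file): a protocol 0 pickle that refers to the Python 2 module name __builtin__, which no longer exists in Python 3.

```python
import pickle

# Hand-written protocol 0 pickle (illustrative assumption): the GLOBAL
# opcode "c" followed by a Python 2 module name and an attribute name.
PY2_GLOBAL = b"c__builtin__\nset\n."

# fix_imports defaults to True, so the old name `__builtin__` is mapped
# to the Python 3 module `builtins` and the lookup succeeds.
loaded = pickle.loads(PY2_GLOBAL)

# With the mapping disabled, Python 3 tries to import `__builtin__` as is.
try:
    pickle.loads(PY2_GLOBAL, fix_imports=False)
    failed_without_fix = False
except ImportError:
    failed_without_fix = True

print(loaded, failed_without_fix)
```

Because True is the default, passing fix_imports=True explicitly should make no difference; it only matters if someone has turned it off.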

I was attempting to load a scipy.sparse.csr.csr_matrix contained in a pickle generated with Python 2.

Inspecting the file format with the UNIX command file gives:

>file -bi python2_generated.pckl
application/octet-stream; charset=binary
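A rough Python-side counterpart to that check is to peek at the first bytes of the file. The helper below is a hypothetical name of my own and only a heuristic sketch: it relies on protocol 2 and later pickles starting with the byte 0x80 followed by the protocol number, and on gzip streams starting with 0x1f 0x8b.

```python
import pickle


def sniff_pickle(data: bytes) -> str:
    """Guess what the leading bytes suggest. Heuristic only, not a validator."""
    if data[:2] == b"\x1f\x8b":
        return "gzip-compressed"
    if len(data) >= 2 and data[0] == 0x80 and 2 <= data[1] <= pickle.HIGHEST_PROTOCOL:
        return f"binary pickle, protocol {data[1]}"
    # Protocol 0/1 pickles have no header, so they cannot be told apart here.
    return "unknown (possibly a protocol 0/1 pickle, or not a pickle at all)"


print(sniff_pickle(pickle.dumps([1, 2], protocol=2)))
```

For a file on disk, something like sniff_pickle(open("python2_generated.pckl", "rb").read(16)) would apply the same check. For a closer look, pickletools.dis from the standard library can disassemble the opcode stream.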

I could load the pickle in Python 3 using the following code:

with open("python2_generated.pckl", "rb") as fd:
    bh01 = pickle.load(fd, fix_imports=True, encoding="latin1")

Note that loading succeeded both with and without explicitly passing fix_imports=True, which is expected since True is the default.

As for the "latin1" encoding, the Python 3 documentation (version 3.7.2) for the pickle.load function says: "Using encoding='latin1' is required for unpickling NumPy arrays and instances of datetime, date and time pickled by Python 2."
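The effect of the encoding argument on Python 2 byte strings can be sketched as follows. The pickle bytes are hand-written for illustration (an assumption standing in for a real Python 2 file): a protocol 2 pickle holding the 8-bit string 'caf\xc3\xa9'.

```python
import pickle

# Hand-written protocol 2 pickle (illustrative assumption) of a Python 2
# 8-bit str: SHORT_BINSTRING opcode "U", length 5, then the raw bytes.
PY2_STR = b"\x80\x02U\x05caf\xc3\xa9q\x00."

# latin1 maps every byte to exactly one code point, so it never fails.
as_latin1 = pickle.loads(PY2_STR, encoding="latin1")
# "bytes" is a special value that leaves 8-bit strings as bytes objects.
as_bytes = pickle.loads(PY2_STR, encoding="bytes")
# utf-8 works only if the original bytes really are valid UTF-8 text.
as_utf8 = pickle.loads(PY2_STR, encoding="utf-8")

# The default, encoding="ASCII", fails on any byte above 0x7f.
try:
    pickle.loads(PY2_STR)
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(repr(as_latin1), as_bytes, repr(as_utf8), ascii_ok)
```

Since latin1 is lossless, a string loaded that way can be turned back into the original bytes with .encode("latin1") if the content turns out to be binary data or text in another encoding.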

Although this is specific to SciPy matrices (or NumPy arrays), and Novak did not clarify what his pickle file contained, I hope this can be of help to other users :)

Salvatore Cosentino answered Oct 16 '22 21:10