I am trying to save a dataframe and a matrix as .npy files with np.save() and then read them using np.load() but I get the following error:
File "/Users/sofiafarina/opt/anaconda3/lib/python3.7/site-packages/numpy/lib/npyio.py", line 457, in load
raise ValueError("Cannot load file containing pickled data "
ValueError: Cannot load file containing pickled data when allow_pickle=False
Even if I write allow_pickle=True I get an error:
File "/Users/sofiafarina/opt/anaconda3/lib/python3.7/site-packages/numpy/lib/npyio.py", line 463, in load
"Failed to interpret file %s as a pickle" % repr(file))
OSError: Failed to interpret file 'finaldf_p_85_12.npy' as a pickle
So how could I save a df from a python script and then load it in another one? Should I use other functions? Thank you!
I used the syntax below to load the .npy file and it worked.
np.load("finaldf_p_85_12.npy",allow_pickle=True)
I think you need to add the allow_pickle=True parameter.
TL;DR: After many searches and hours of debugging, I found that the issue was with git-lfs: my files had not been pulled through git-lfs. These commands fetch them:
git lfs install
git lfs pull
I think NumPy should report this case more clearly.
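To check whether a file is a real NumPy file or an un-pulled Git LFS pointer, you can look at its first bytes. The following is only a minimal sketch and not part of NumPy: sniff_array_file and the file names are hypothetical, and the pointer contents written in the demo are a made-up placeholder. It relies on .npy files starting with the bytes \x93NUMPY, .npz files being zip archives (starting with PK), and LFS pointer files being small text files whose first line starts with "version https://git-lfs".

```python
import os
import tempfile

import numpy as np

NPY_MAGIC = b"\x93NUMPY"                 # first bytes of every .npy file
ZIP_MAGIC = b"PK"                        # .npz archives are zip files
LFS_PREFIX = b"version https://git-lfs"  # first line of a Git LFS pointer file


def sniff_array_file(path):
    """Guess what a supposed .npy/.npz file really contains (hypothetical helper)."""
    with open(path, "rb") as fh:
        head = fh.read(64)
    if head.startswith(NPY_MAGIC):
        return "npy"
    if head.startswith(ZIP_MAGIC):
        return "npz"
    if head.startswith(LFS_PREFIX):
        return "git-lfs pointer"
    return "unknown"


# Demo: a real array file versus a fake LFS pointer with the same extension.
tmpdir = tempfile.mkdtemp()
real_path = os.path.join(tmpdir, "real.npy")
pointer_path = os.path.join(tmpdir, "pointer.npy")

np.save(real_path, np.zeros((2, 3), dtype=np.uint8))
with open(pointer_path, "wb") as fh:
    # Placeholder pointer text; the oid and size are made up for illustration.
    fh.write(b"version https://git-lfs.github.com/spec/v1\n"
             b"oid sha256:0000\nsize 123\n")

print(sniff_array_file(real_path))
print(sniff_array_file(pointer_path))
```

If the helper reports a pointer (or "unknown"), np.load never saw array data at all, which would explain getting the pickle errors even for a uint8 array.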
I had the exact same issue. The dtype in my .npz file was uint8, not object, so technically allow_pickle should not be required. My NumPy version is 1.20.x.
Got the following when using allow_pickle=False:
ValueError: Cannot load file containing pickled data when allow_pickle=False
And with allow_pickle=True I got:
OSError: Failed to interpret file 'finaldf_p_85_12.npy' as a pickle
Python has a built-in data serialization module called pickle. NumPy uses it to save object arrays (arrays whose elements are arbitrary Python objects, such as ragged nested lists), and the NumPy documentation warns against loading pickled data:
Warning: Loading files that contain object arrays uses the pickle module, which is not secure against erroneous or maliciously constructed data. Consider passing allow_pickle=False to load data that is known not to contain object arrays for the safer handling of untrusted sources.
You might be saving an array that consists of a single DataFrame, which causes pickling. Example:
x = array([[ 0.1, 0.1, 0.1],
[ 0.1, 0.1, 0.1],
[ 0.1, 0.1, 0.1],
[ 0.1, 0.1, 0.1],
[ 0.1, 0.1, 0.1],
[ 0.1, 0.1, 0.1],
[ 0.1, 0.1, 0.1]])
In that case, try saving just the NumPy array with np.save(filename, x[0]). This saves the data without pickling and resolves the issue.
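To see the difference between the two cases, here is a small sketch. The file names and the dict stored in the object array are just illustrative assumptions; the point is that a plain numeric array round-trips with allow_pickle=False, while an array of dtype object does not.

```python
import os
import tempfile

import numpy as np

tmpdir = tempfile.mkdtemp()

# A plain numeric array: stored as raw bytes, no pickle involved.
numeric = np.full((7, 3), 0.1)
numeric_path = os.path.join(tmpdir, "numeric.npy")
np.save(numeric_path, numeric)
loaded_numeric = np.load(numeric_path, allow_pickle=False)

# An object array (here wrapping a dict): NumPy has to pickle its elements.
wrapped = np.empty(1, dtype=object)
wrapped[0] = {"values": [0.1, 0.1, 0.1]}
object_path = os.path.join(tmpdir, "wrapped.npy")
np.save(object_path, wrapped)

try:
    np.load(object_path, allow_pickle=False)
    needs_pickle = False
except ValueError:
    # Raised because the file holds an object array.
    needs_pickle = True

loaded_object = np.load(object_path, allow_pickle=True)
print(numeric.dtype, wrapped.dtype, needs_pickle)
```

Checking arr.dtype before calling np.save is a cheap way to find out whether the file will need allow_pickle=True later.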
The OSError suggests you could be having a Python 2/Python 3 issue. I had the same problem and errors when trying to read, with Python 3, a file that had been written in Python 2. For me, calling np.load with the following arguments worked:
np.load('file.npy', allow_pickle=True, fix_imports=True, encoding='latin1')
The doc for numpy.load says about the encoding argument, "Only useful when loading Python 2 generated pickled files in Python 3, which includes npy/npz files containing object arrays."
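If you do not know which Python version wrote a file, one option is to try the default load first and only fall back to latin1. This is a rough sketch under a few assumptions: load_with_py2_fallback is a hypothetical helper, it assumes the failure shows up as a UnicodeError (which is how NumPy reports a failed decode as far as I know), and the demo file is written by Python 3, so it only exercises the first branch. Only use allow_pickle=True on files you trust.

```python
import os
import tempfile

import numpy as np


def load_with_py2_fallback(path):
    """Load an .npy file, retrying with encoding='latin1' on decode errors."""
    try:
        return np.load(path, allow_pickle=True)
    except UnicodeError:
        # Assumption: the file holds a pickle written by Python 2.
        return np.load(path, allow_pickle=True, encoding="latin1")


# Demo with an object array written by the current interpreter.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.npy")
demo = np.empty(2, dtype=object)
demo[0] = "abc"
demo[1] = [1, 2, 3]
np.save(demo_path, demo)

restored = load_with_py2_fallback(demo_path)
print(restored)
```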