I am trying to save a large numpy array and reload it. Using numpy.save
and numpy.load
, the array values are corrupted/change. The shape and data type of the array pre-saving, and post-loading, are the same, but the post-loading array has the vast majority of the values zeroed.
The array is (22915,22915), values are float64's, takes 3.94 gb's as a .npy file, and the data entries average about .1 (not tiny floats that might reasonably get converted to zeroes). I am using numpy 1.5.1.
Any help on why this corruption is occurring would be greatly appreciated because I am at a loss. Below is some code providing evidence of the claims above.
In [7]: m
Out[7]:
array([[ 0. , 0.02023, 0.00703, ..., 0.02362, 0.02939, 0.03656],
[ 0.02023, 0. , 0.0135 , ..., 0.04357, 0.04934, 0.05651],
[ 0.00703, 0.0135 , 0. , ..., 0.03037, 0.03614, 0.04331],
...,
[ 0.02362, 0.04357, 0.03037, ..., 0. , 0.01797, 0.02514],
[ 0.02939, 0.04934, 0.03614, ..., 0.01797, 0. , 0.01919],
[ 0.03656, 0.05651, 0.04331, ..., 0.02514, 0.01919, 0. ]])
In [8]: m.shape
Out[8]: (22195, 22195)
In [12]: save('/Users/will/Desktop/m.npy',m)
In [14]: lm = load('/Users/will/Desktop/m.npy')
In [15]: lm
Out[15]:
array([[ 0. , 0.02023, 0.00703, ..., 0. , 0. , 0. ],
[ 0. , 0. , 0. , ..., 0. , 0. , 0. ],
[ 0. , 0. , 0. , ..., 0. , 0. , 0. ],
...,
[ 0. , 0. , 0. , ..., 0. , 0. , 0. ],
[ 0. , 0. , 0. , ..., 0. , 0. , 0. ],
[ 0. , 0. , 0. , ..., 0. , 0. , 0. ]])
In [17]: type(lm[0][0])
Out[17]: numpy.float64
In [18]: type(m[0][0])
Out[18]: numpy.float64
In [19]: lm.shape
Out[19]: (22195, 22195)
This is a known issue (note that that links against numpy 1.4). If you really can't upgrade, my advice would be to try to save in a different way (savez, savetxt). If getbuffer is available you can try to write the bytes directly. If all else fails (and you can't upgrade), you can write your own save function pretty easily.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With