I want to be able to save my array subclass to an .npy file and recover the result later.
Something like:
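>>> import numpy as np
>>> class MyArray(np.ndarray): pass
>>> data = np.arange(10).view(MyArray)  # view the data as the subclass
>>> np.save('fname', data)  # np.save appends '.npy' to the name
>>> data2 = np.load('fname.npy')
>>> assert isinstance(data2, MyArray) # raises AssertionError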
The docs say (emphasis mine):
The format explicitly does not need to:
- [...]
- Fully handle arbitrary subclasses of numpy.ndarray. Subclasses will be accepted for writing, but only the array data will be written out. A regular numpy.ndarray object will be created upon reading the file. The API can be used to build a format for a particular subclass, but that is out of scope for the general NPY format.
So is it possible to make the above code not raise an AssertionError?
I don't see evidence that np.save handles array subclasses.
I tried to save a np.matrix with it, and got back an ndarray.
I tried to save a np.ma array, and got an error:
NotImplementedError: MaskedArray.tofile() not implemented yet.
Saving is done by np.lib.npyio.format.write_array, which does
    _write_array_header() # save dtype, shape etc
If dtype is object it uses pickle.dump(array, fp ...); otherwise it does array.tofile(fp). tofile handles writing the data buffer.
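A minimal sketch of what that write path means in practice, assuming a trivial subclass named MyArray (borrowed from the question) and an in-memory buffer standing in for the file. Only the header and the raw data buffer go into the stream, so nothing is left to tell np.load which class to rebuild:

```python
import io

import numpy as np


class MyArray(np.ndarray):
    """Trivial ndarray subclass, used only for illustration."""


data = np.arange(10).view(MyArray)

buf = io.BytesIO()   # in-memory stand-in for an .npy file
np.save(buf, data)   # writes the header plus the raw data buffer
buf.seek(0)
loaded = np.load(buf)

print(type(data).__name__)    # class before saving
print(type(loaded).__name__)  # class after loading
```

The values survive the round trip; the subclass does not.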
I think pickle.dump of an array ends up using np.save, but I don't recall how that's triggered.
I can for example pickle an array, and load it:
In [657]: f=open('test','wb')
In [658]: pickle.Pickler(f).dump(x)
In [659]: f.close()
In [660]: np.load('test')
In [664]: f=open('test','rb')
In [665]: pickle.load(f)
This pickle dump/load sequence works for test np.ma, np.matrix and sparse.coo_matrix cases. So that's probably the direction to explore for your own subclass.
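As a rough check of that claim, here is a sketch that round-trips two subclasses shipped with NumPy through pickle. It uses pickle.dumps / pickle.loads in memory rather than a file, which is my simplification; the sample values are arbitrary:

```python
import pickle

import numpy as np

# Two ndarray subclasses that ship with NumPy.
masked = np.ma.masked_array([1, 2, 3], mask=[False, True, False])
matrix = np.matrix([[1, 2], [3, 4]])

# In-memory round trip; pickle.dump / pickle.load on an open file
# go through the same machinery.
masked2 = pickle.loads(pickle.dumps(masked))
matrix2 = pickle.loads(pickle.dumps(matrix))

print(type(masked2).__name__, masked2.mask.tolist())
print(type(matrix2).__name__)
```

Both objects come back as their original class, and the masked array keeps its mask.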
Searching on numpy and pickle I found Preserve custom attributes when pickling subclass of numpy array. The answer involves a custom .__reduce__ and .__setstate__.
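A sketch of that approach, under stated assumptions: the subclass name MyArray comes from the question, while the label attribute is hypothetical and stands in for whatever extra state your subclass carries. The idea is to append the attribute to the state tuple that ndarray already produces, then peel it off again when unpickling:

```python
import pickle

import numpy as np


class MyArray(np.ndarray):
    """ndarray subclass with one extra attribute, `label` (hypothetical)."""

    def __new__(cls, input_array, label=None):
        obj = np.asarray(input_array).view(cls)
        obj.label = label
        return obj

    def __array_finalize__(self, obj):
        # Runs for views and slices; copy the attribute from the parent.
        if obj is None:
            return
        self.label = getattr(obj, "label", None)

    def __reduce__(self):
        # Reuse ndarray's reduce tuple and append our attribute to its state.
        reconstruct, args, state = super().__reduce__()
        return reconstruct, args, state + (self.label,)

    def __setstate__(self, state):
        # The last item is ours; everything before it belongs to ndarray.
        self.label = state[-1]
        super().__setstate__(state[:-1])


data = MyArray(np.arange(10), label="demo")
restored = pickle.loads(pickle.dumps(data))

print(type(restored).__name__, restored.label)
```

Note that this goes through pickle, not np.save; the class also has to be importable under the same name when the file is loaded.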