
How can I make np.save work for an ndarray subclass?

Tags: python, numpy

I want to be able to save my array subclass to an npy file and recover it later, with the subclass type intact.

Something like:

>>> import numpy as np
>>> class MyArray(np.ndarray): pass
>>> data = np.arange(10).view(MyArray)
>>> np.save('fname.npy', data)
>>> data2 = np.load('fname.npy')
>>> assert isinstance(data2, MyArray)  # raises AssertionError

The docs say (emphasis mine):

The format explicitly does not need to:

  • [...]
  • Fully handle arbitrary subclasses of numpy.ndarray. Subclasses will be accepted for writing, but only the array data will be written out. A regular numpy.ndarray object will be created upon reading the file. The API can be used to build a format for a particular subclass, but that is out of scope for the general NPY format.

So is it possible to make the above code not raise an AssertionError?

Eric asked Aug 08 '16 22:08



1 Answer

I don't see evidence that np.save handles array subclasses.

I tried to save an np.matrix with it, and got back a plain ndarray.

I tried to save an np.ma masked array, and got an error:

NotImplementedError: MaskedArray.tofile() not implemented yet.

Saving is done by np.lib.format.write_array, which first calls

_write_array_header()   # save dtype, shape etc.

If the dtype is object, it then uses pickle.dump(array, fp, ...).

Otherwise it calls array.tofile(fp), which writes only the raw data buffer. Nothing in the header or the data records the subclass type.
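To make the steps above concrete, here is a minimal sketch that calls write_array and read_array directly on an in-memory buffer. MyArray is a hypothetical empty subclass used only for illustration, and the in-memory buffer is my own choice to avoid touching the disk.

```python
import io

import numpy as np


class MyArray(np.ndarray):
    """Hypothetical subclass, used only for this illustration."""


data = np.arange(10).view(MyArray)

# Write the NPY header plus the raw data into an in-memory buffer.
buf = io.BytesIO()
np.lib.format.write_array(buf, data)

# Read it back: the header only describes dtype, memory order and shape.
buf.seek(0)
restored = np.lib.format.read_array(buf)

print(type(data).__name__)
print(type(restored).__name__)
```

The values survive the round trip, but the result should come back as a base ndarray, which matches what the format documentation promises.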

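pickle.dump of an array does not go through np.save; pickling uses the array's own __reduce__ method. In the other direction, np.load falls back to unpickling when the file is not in npy or npz format.

For example, I can pickle an array and load it back either way: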

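In [657]: f = open('test', 'wb')
In [658]: pickle.Pickler(f).dump(x)   # x is the array under test
In [659]: f.close()
In [660]: np.load('test', allow_pickle=True)   # recent NumPy versions require allow_pickle=True here
In [664]: f = open('test', 'rb')
In [665]: pickle.load(f)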

This pickle dump/load sequence preserves the type in my np.ma, np.matrix and sparse.coo_matrix test cases. So that's probably the direction to explore for your own subclass.
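As a rough check of that direction, the sketch below round-trips the question's MyArray through pickle in memory. It assumes the class is defined at module level so that pickle can find it again by name when loading.

```python
import pickle

import numpy as np


class MyArray(np.ndarray):
    """The empty subclass from the question."""


data = np.arange(10).view(MyArray)

# Pickling stores a reference to the class along with the array state.
blob = pickle.dumps(data)
data2 = pickle.loads(blob)

print(type(data2).__name__)
```

The same bytes could be written to a file instead and read back with pickle.load. The usual caveat applies: only unpickle data from sources you trust.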

Searching on numpy and pickle, I found the question "Preserve custom attributes when pickling subclass of numpy array". The answer there involves a custom __reduce__ and __setstate__.
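A minimal sketch of that approach, assuming a subclass that carries one extra attribute. The attribute name label and the constructor signature are hypothetical, chosen only for the example; the idea is to append the extra value to the state tuple that ndarray already produces, and to strip it off again before handing the rest back to ndarray.

```python
import pickle

import numpy as np


class MyArray(np.ndarray):
    """Hypothetical subclass carrying one extra attribute, `label`."""

    def __new__(cls, input_array, label=None):
        obj = np.asarray(input_array).view(cls)
        obj.label = label
        return obj

    def __array_finalize__(self, obj):
        # Called for views, slices and when unpickling (obj is None then).
        if obj is None:
            return
        self.label = getattr(obj, "label", None)

    def __reduce__(self):
        # Take ndarray's own pickle state and append the extra attribute.
        reconstruct, args, state = super().__reduce__()
        return reconstruct, args, state + (self.label,)

    def __setstate__(self, state):
        # Peel off the extra attribute, pass the rest to ndarray.
        self.label = state[-1]
        super().__setstate__(state[:-1])


data = MyArray(np.arange(10), label="example")
data2 = pickle.loads(pickle.dumps(data))

print(type(data2).__name__, data2.label)
```

Without the two overrides, the subclass type would still survive pickling, but label would not be restored.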

hpaulj answered Sep 21 '22 21:09