How can I make np.save work for an ndarray subclass?

Tags: python, numpy

I want to be able to save my array subclass to a .npy file, and recover the result later.

Something like:

>>> import numpy as np
>>> class MyArray(np.ndarray): pass
>>> data = np.arange(10).view(MyArray)  # view the data as the subclass
>>> np.save('fname', data)              # writes 'fname.npy'
>>> data2 = np.load('fname.npy')
>>> assert isinstance(data2, MyArray)  # raises AssertionError

The docs say (emphasis mine):

The format explicitly does not need to:

[...]

Fully handle arbitrary subclasses of numpy.ndarray. Subclasses will be accepted for writing, but only the array data will be written out. A regular numpy.ndarray object will be created upon reading the file. The API can be used to build a format for a particular subclass, but that is out of scope for the general NPY format.

So is it possible to make the above code not raise an AssertionError?


asked Aug 08 '16 22:08

Eric

1 Answer

I don't see evidence that np.save handles array subclasses.

I tried to save a np.matrix with it, and got back an ndarray.

I tried to save a np.ma array, and got an error:

NotImplementedError: MaskedArray.tofile() not implemented yet.

Saving is done by np.lib.npyio.format.write_array, which first calls

_write_array_header()   # save dtype, shape etc

If dtype is object, it uses pickle.dump(array, fp ...); otherwise it calls array.tofile(fp), which writes only the raw data buffer.
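As a small check of the behaviour described above, the sketch below round-trips a subclass instance through np.lib.format.write_array and read_array using an in-memory buffer. MyArray is the bare subclass from the question. The final .view() call is only one possible workaround, and it assumes the reader already knows which class to restore.

```python
import io
import numpy as np

class MyArray(np.ndarray):
    """Bare subclass from the question, used only for illustration."""

data = np.arange(10).view(MyArray)

buf = io.BytesIO()
np.lib.format.write_array(buf, data)    # header (dtype, shape, order) + raw data
buf.seek(0)
loaded = np.lib.format.read_array(buf)  # the subclass is not recorded anywhere

print(type(loaded))                     # a plain numpy.ndarray

# Workaround sketch: re-view the loaded data as the subclass by hand.
restored = loaded.view(MyArray)
```

Only the array data survives this way; any attributes the subclass carries would have to be stored and restored separately.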

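pickle.dump of an array does not go through np.save: it calls the array's __reduce__ method, which records the class along with the shape, dtype and data. np.load, for its part, falls back to unpickling when the file does not start with the NPY magic string (recent NumPy versions require allow_pickle=True for this).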

I can for example pickle an array, and load it:

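In [655]: import pickle, numpy as np
In [656]: x = np.matrix([[1, 2], [3, 4]])     # any ndarray subclass instance
In [657]: f = open('test', 'wb')
In [658]: pickle.Pickler(f).dump(x)
In [659]: f.close()
In [660]: np.load('test', allow_pickle=True)  # no NPY header, so np.load unpickles
In [664]: f = open('test', 'rb')
In [665]: pickle.load(f)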

This pickle dump/load sequence works for my np.ma, np.matrix and sparse.coo_matrix test cases, so that's probably the direction to explore for your own subclass.
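For the bare subclass from the question, a pickle round trip might look like the sketch below. The temporary file name is made up for the example, and the class has to be importable under the same name when the file is read back.

```python
import os
import pickle
import tempfile
import numpy as np

class MyArray(np.ndarray):
    """Bare subclass, as in the question."""

data = np.arange(10).view(MyArray)

# Hypothetical file location, chosen only for this example.
path = os.path.join(tempfile.mkdtemp(), "data.pkl")

with open(path, "wb") as f:
    pickle.dump(data, f)    # the pickle records the class as well as the data
with open(path, "rb") as f:
    data2 = pickle.load(f)

assert isinstance(data2, MyArray)   # no AssertionError this time
```

The usual pickle caveat applies: only unpickle files you trust.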

Searching on numpy and pickle, I found "Preserve custom attributes when pickling subclass of numpy array". The answer there involves a custom .__reduce__ and .__setstate__.
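A minimal sketch of that __reduce__ / __setstate__ pattern is shown below. The class name LabeledArray and its label attribute are hypothetical, invented for this example; the idea is to append the extra attribute to the state tuple that ndarray already pickles, and to strip it off again when loading.

```python
import pickle
import numpy as np

class LabeledArray(np.ndarray):
    """Hypothetical subclass carrying one extra attribute, `label`."""

    def __new__(cls, input_array, label=None):
        obj = np.asarray(input_array).view(cls)
        obj.label = label
        return obj

    def __array_finalize__(self, obj):
        # Called for views and slices; carry the attribute along.
        if obj is None:
            return
        self.label = getattr(obj, "label", None)

    def __reduce__(self):
        # Append our attribute to the state tuple ndarray would pickle.
        reconstruct, args, state = super().__reduce__()
        return reconstruct, args, state + (self.label,)

    def __setstate__(self, state):
        # Take our attribute off the end, hand the rest to ndarray.
        self.label = state[-1]
        super().__setstate__(state[:-1])

arr = LabeledArray(np.arange(5), label="demo")
arr2 = pickle.loads(pickle.dumps(arr))
```

After the round trip, arr2 is again a LabeledArray and carries the same label as arr.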


answered Sep 21 '22 21:09

hpaulj
