
Load sparse array from npy file

I am trying to load a sparse array that I previously saved. Saving the sparse array was easy enough, but reading it back is a pain: scipy.load returns a 0d array wrapped around my sparse array.

import scipy as sp
A = sp.load("my_array"); A
array(<325729x325729 sparse matrix of type '<type 'numpy.int8'>'
with 1497134 stored elements in Compressed Sparse Row format>, dtype=object)

In order to get a sparse matrix I have to flatten the 0d array, or use sp.asarray(A). This seems like a really hard way to do things. Is Scipy smart enough to understand that it has loaded a sparse array? Is there a better way to load a sparse array?

iform asked Jun 08 '11 16:06




2 Answers

For all the up votes of the mmwrite answer, I'm surprised no one tried to answer the actual question. But since it has been reactivated, I'll give it a try.

This reproduces the OP case:

In [90]: x=sparse.csr_matrix(np.arange(10).reshape(2,5))
In [91]: np.save('save_sparse.npy',x)
In [92]: X=np.load('save_sparse.npy')
In [93]: X[()].A
Out[93]: 
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
In [94]: x
Out[94]: 
<2x5 sparse matrix of type '<type 'numpy.int32'>'
    with 9 stored elements in Compressed Sparse Row format>
In [95]: X
Out[95]: 
array(<2x5 sparse matrix of type '<type 'numpy.int32'>'
    with 9 stored elements in Compressed Sparse Row format>, dtype=object)

The [()] index that user4713166 gave us is not a 'hard way' to extract the sparse array.

np.save and np.load are designed to operate on ndarrays. But a sparse matrix is not such an array, nor is it a subclass (as np.matrix is). It appears that np.save wraps the non-array object in an object dtype array, and saves it along with a pickled form of the object.
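The wrapping step can be seen without touching the disk. This is a minimal sketch, assuming np.save coerces its argument the same way np.asanyarray does, which is what the object-dtype result above suggests; the variable names are only illustrative.

```python
import numpy as np
from scipy import sparse

x = sparse.csr_matrix(np.arange(10).reshape(2, 5))

# Coercing a sparse matrix to an ndarray does not densify it.
# NumPy wraps the whole object in a 0-d array with dtype=object.
wrapped = np.asanyarray(x)

print(wrapped.shape, wrapped.dtype)  # shape and dtype of the wrapper
print(wrapped[()] is x)              # indexing with () hands back the object
```

The empty-tuple index is simply the way to address the single element of a 0-d array.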

When I try to save a different kind of object, one that can't be pickled, I get an error message at:

403  # We contain Python objects so we cannot write out the data directly.
404  # Instead, we will pickle it out with version 2 of the pickle protocol.

--> 405 pickle.dump(array, fp, protocol=2)

So in answer to "Is Scipy smart enough to understand that it has loaded a sparse array?": no. np.load does not know about sparse arrays. But np.save is smart enough to punt when given something that isn't an array, and np.load does what it can with what it finds in the file.

As to alternative methods of saving and loading sparse arrays, the MATLAB-compatible io.savemat method has been mentioned. It would be my first choice. But this example also shows that you can use regular Python pickling. That might be better if you need to preserve a particular sparse format. And np.save isn't bad if you can live with the [()] extraction step. :)
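Both alternatives can be sketched side by side. This is an illustration, not a benchmark: the file names and the variable name "x" inside the .mat file are made up, and float data is used on the assumption that it is the safest dtype for the MATLAB format.

```python
import os
import pickle
import tempfile

import numpy as np
from scipy import io, sparse

x = sparse.csr_matrix(np.arange(10, dtype=float).reshape(2, 5))

with tempfile.TemporaryDirectory() as tmp:
    # Plain pickling: the object comes back in its original sparse format.
    pkl_path = os.path.join(tmp, "x.pkl")
    with open(pkl_path, "wb") as f:
        pickle.dump(x, f)
    with open(pkl_path, "rb") as f:
        from_pickle = pickle.load(f)

    # MATLAB route: savemat takes a dict of name -> value,
    # and loadmat returns a dict keyed by the same names.
    mat_path = os.path.join(tmp, "x.mat")
    io.savemat(mat_path, {"x": x})
    from_mat = io.loadmat(mat_path)["x"]

print(from_pickle.format, from_mat.format)
```

Comparing the two printed formats shows whether each route kept the original storage layout.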


In https://github.com/scipy/scipy/blob/master/scipy/io/matlab/mio5.py, write_sparse saves sparse matrices in csc format. Along with headers, it saves A.indices.astype('i4'), A.indptr.astype('i4'), A.data.real, and optionally A.data.imag.
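The same three arrays are enough to rebuild a compressed sparse matrix by hand. The sketch below is not what write_sparse does internally; it stores the components with np.savez under key names chosen here for illustration, then reconstructs the matrix.

```python
import os
import tempfile

import numpy as np
from scipy import sparse

x = sparse.csc_matrix(np.arange(10).reshape(2, 5))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "x_parts.npz")
    # Only plain ndarrays are written, so no pickling is involved.
    np.savez(path, data=x.data, indices=x.indices,
             indptr=x.indptr, shape=np.array(x.shape))

    with np.load(path) as parts:
        shape = tuple(int(n) for n in parts["shape"])
        y = sparse.csc_matrix(
            (parts["data"], parts["indices"], parts["indptr"]),
            shape=shape,
        )

print(y.shape, y.nnz)
```

Newer SciPy releases also provide scipy.sparse.save_npz and load_npz, which work along similar lines and may be worth trying before rolling your own.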


In quick tests I find that np.save/load handles all sparse formats, except dok, where the load complains about a missing shape. Otherwise I'm not finding any special pickling code in the sparse files.
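A quick test of this kind might look like the sketch below. It assumes a NumPy version where np.load accepts allow_pickle, and it records failures instead of raising, since the behavior for dok (and possibly other formats) may differ between SciPy versions.

```python
import os
import tempfile

import numpy as np
from scipy import sparse

base = sparse.csr_matrix(np.arange(10).reshape(2, 5))
results = {}

with tempfile.TemporaryDirectory() as tmp:
    for fmt in ["csr", "csc", "coo", "lil", "dia", "bsr", "dok"]:
        path = os.path.join(tmp, fmt + ".npy")
        try:
            np.save(path, base.asformat(fmt))
            loaded = np.load(path, allow_pickle=True)[()]
            # Success means the format and the values both survived.
            results[fmt] = bool(
                loaded.format == fmt
                and np.array_equal(loaded.toarray(), base.toarray())
            )
        except Exception as exc:
            results[fmt] = "failed: %r" % (exc,)

for fmt, outcome in results.items():
    print(fmt, outcome)
```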

hpaulj answered Nov 03 '22 19:11


One can extract the object hidden away in the 0d array using () as index:

A = sp.load("my_array")[()]

This looks weird, but it seems to work anyway, and it is a very short workaround.
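For reference, here is a self-contained round trip. Two things in it are not from the question: the temporary file path, and allow_pickle=True, which recent NumPy versions require before np.load will unpickle an object array. The .item() call is an alternative spelling of the same extraction.

```python
import os
import tempfile

import numpy as np
from scipy import sparse

x = sparse.csr_matrix(np.arange(10).reshape(2, 5))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "my_array.npy")
    np.save(path, x)  # stored as a pickled 0-d object array
    loaded = np.load(path, allow_pickle=True)

A = loaded[()]     # empty-tuple index, as in the answer above
B = loaded.item()  # equivalent extraction of the single element

print(type(A).__name__, A.shape)
```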

user4713166 answered Nov 03 '22 18:11