I am trying to load a sparse array that I have previously saved. Saving the sparse array was easy enough. Trying to read it, though, is a pain. scipy.load returns a 0d array around my sparse array.
import scipy as sp
A = sp.load("my_array"); A
array(<325729x325729 sparse matrix of type '<type 'numpy.int8'>'
with 1497134 stored elements in Compressed Sparse Row format>, dtype=object)
In order to get a sparse matrix I have to flatten the 0d array, or use sp.asarray(A). This seems like a really hard way to do things. Is Scipy smart enough to understand that it has loaded a sparse array? Is there a better way to load a sparse array?
For all the upvotes on the mmwrite answer, I'm surprised no one tried to answer the actual question. But since it has been reactivated, I'll give it a try.
This reproduces the OP case:
In [90]: x=sparse.csr_matrix(np.arange(10).reshape(2,5))
In [91]: np.save('save_sparse.npy',x)
In [92]: X=np.load('save_sparse.npy')
In [95]: X
Out[95]:
array(<2x5 sparse matrix of type '<type 'numpy.int32'>'
with 9 stored elements in Compressed Sparse Row format>, dtype=object)
In [96]: X[()].A
Out[96]:
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
In [94]: x
Out[94]:
<2x5 sparse matrix of type '<type 'numpy.int32'>'
with 9 stored elements in Compressed Sparse Row format>
The [()] indexing that user4713166 gave us is not a 'hard way' to extract the sparse array.
np.save and np.load are designed to operate on ndarrays. But a sparse matrix is not such an array, nor is it a subclass (as np.matrix is). It appears that np.save wraps the non-array object in an object dtype array, and saves it along with a pickled form of the object.
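The wrapping step can be reproduced without touching a file. This is a minimal sketch, assuming a recent NumPy and SciPy: np.save converts its argument with np.asanyarray before writing, and for a sparse matrix that conversion produces a 0-d array of dtype object. The variable names are illustrative only.

```python
import numpy as np
from scipy import sparse

x = sparse.csr_matrix(np.arange(10).reshape(2, 5))

# The same conversion np.save applies to its argument before writing.
wrapped = np.asanyarray(x)

# The result is a 0-d array of dtype object, not a 2x5 numeric array.
print(wrapped.ndim, wrapped.dtype)

# Indexing with an empty tuple unwraps the single stored object.
inner = wrapped[()]
print(type(inner).__name__, inner.shape)
```

Because the array holds a Python object rather than numbers, the data cannot be written as a raw buffer, which is why pickling comes into play.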
When I try to save a different kind of object, one that can't be pickled, I get an error message at:
403 # We contain Python objects so we cannot write out the data directly.
404 # Instead, we will pickle it out with version 2 of the pickle protocol.
--> 405 pickle.dump(array, fp, protocol=2)
So in answer to "Is Scipy smart enough to understand that it has loaded a sparse array?", no. np.load does not know about sparse arrays. But np.save is smart enough to punt when given something that isn't an array, and np.load does what it can with what it finds in the file.
As to alternative methods of saving and loading sparse arrays, the MATLAB-compatible io.savemat method has been mentioned. It would be my first choice. But this example also shows that you can use regular Python pickling. That might be better if you need to save a particular sparse format. And np.save isn't bad if you can live with the [()] extraction step. :)
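A rough sketch of the two alternatives, side by side. The file names and the variable name "M" are made up for illustration, and float data is used here only to keep the .mat round trip simple; treat this as one possible way to do it, not a recommendation from the SciPy docs.

```python
import os
import pickle
import tempfile

import numpy as np
from scipy import io, sparse

x = sparse.csr_matrix(np.arange(10, dtype=float).reshape(2, 5))

with tempfile.TemporaryDirectory() as tmp:
    # MATLAB-compatible route; "M" is an arbitrary variable name.
    mat_path = os.path.join(tmp, "save_sparse.mat")
    io.savemat(mat_path, {"M": x})
    from_mat = io.loadmat(mat_path)["M"]

    # Plain pickle stores the object itself, including its sparse format.
    pkl_path = os.path.join(tmp, "save_sparse.pkl")
    with open(pkl_path, "wb") as f:
        pickle.dump(x, f)
    with open(pkl_path, "rb") as f:
        from_pickle = pickle.load(f)

# The .mat route may hand back a different sparse format than was saved.
print(type(from_mat).__name__, from_pickle.format)
```

If the original format matters after the .mat round trip, a conversion such as from_mat.tocsr() restores it.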
https://github.com/scipy/scipy/blob/master/scipy/io/matlab/mio5.py
write_sparse - sparse matrices are saved in csc format. Along with headers it saves A.indices.astype('i4'), A.indptr.astype('i4'), A.data.real, and optionally A.data.imag.
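Those three arrays plus the shape are enough to rebuild the matrix. The sketch below is not the .mat layout; it only stores the same CSC components in an ordinary .npz archive (kept in memory here) and reassembles them. The key names are arbitrary choices for this example.

```python
import io

import numpy as np
from scipy import sparse

x = sparse.csr_matrix(np.arange(10).reshape(2, 5))
A = x.tocsc()  # same format write_sparse works from

# Store the plain ndarrays that define the CSC matrix; no pickling involved.
buf = io.BytesIO()
np.savez(buf, indices=A.indices, indptr=A.indptr, data=A.data, shape=A.shape)

# Read the components back and rebuild the matrix from them.
buf.seek(0)
with np.load(buf) as parts:
    shape = tuple(int(n) for n in parts["shape"])
    rebuilt = sparse.csc_matrix(
        (parts["data"], parts["indices"], parts["indptr"]), shape=shape
    )
```

Since only numeric arrays are stored, loading needs no object unwrapping and no [()] step.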
In quick tests I find that np.save/load handles all sparse formats, except dok, where the load complains about a missing shape. Otherwise I'm not finding any special pickling code in the sparse files.
One can extract the object hidden away in the 0d array using () as index:
A = sp.load("my_array")[()]
This looks weird, but it seems to work anyway, and it is a very short workaround.
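For reference, here is a sketch of the full round trip on a current NumPy. One thing that differs from the session shown in this thread: newer NumPy releases refuse to unpickle object arrays unless allow_pickle=True is passed to np.load. The file name is just an example.

```python
import os
import tempfile

import numpy as np
from scipy import sparse

x = sparse.csr_matrix(np.arange(10).reshape(2, 5))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "save_sparse.npy")
    np.save(path, x)  # written as a pickled 0-d object array
    # Unpickling must be enabled explicitly on newer NumPy versions.
    loaded = np.load(path, allow_pickle=True)

# .item() and [()] both pull the single object out of the 0-d array.
restored = loaded.item()
print(type(restored).__name__, restored.shape)
```

Only enable allow_pickle for files you trust, since unpickling can execute arbitrary code.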