How do you save/load a scipy sparse csr_matrix in a portable format? The scipy sparse matrix is created on Python 3 (Windows 64-bit) to run on Python 2 (Linux 64-bit). Initially, I used pickle (with protocol=2 and fix_imports=True), but this didn't work going from Python 3.2.2 (Windows 64-bit) to Python 2.7.2 (Windows 32-bit); I got the error:
TypeError: ('data type not understood', <built-in function _reconstruct>, (<type 'numpy.ndarray'>, (0,), '[98]')).
Next, I tried numpy.save and numpy.load, as well as scipy.io.mmwrite() and scipy.io.mmread(), and none of these methods worked either.
scipy.sparse.save_npz saves a sparse matrix to a file using the .npz format. Its file argument is either the file name (string) or an open file (file-like object) where the data will be saved.
The function csr_matrix() is used to create a sparse matrix of compressed sparse row format whereas csc_matrix() is used to create a sparse matrix of compressed sparse column format.
Another way to save a sparse matrix is as an .mtx file, which stores the matrix in MatrixMarket format. In R, the writeMM function from the Matrix package saves a sparse matrix object to such a file.
The compressed sparse row (CSR) or compressed row storage (CRS) or Yale format represents a matrix M by three (one-dimensional) arrays, that respectively contain nonzero values, the extents of rows, and column indices. It is similar to COO, but compresses the row indices, hence the name.
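To make the three arrays concrete, here is a minimal sketch that builds a small csr_matrix and prints its data, indices, and indptr attributes. The 3x3 matrix is an arbitrary illustration, not taken from the question.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Small example matrix (chosen only for illustration).
dense = np.array([[1, 0, 2],
                  [0, 0, 3],
                  [4, 5, 6]])
m = csr_matrix(dense)

print(m.data)     # nonzero values, stored row by row
print(m.indices)  # column index of each stored value
print(m.indptr)   # row i occupies data[indptr[i]:indptr[i + 1]]
```

Row 1 here holds a single value, so indptr advances by one between its second and third entries.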
edit: scipy 0.19 now has scipy.sparse.save_npz and scipy.sparse.load_npz.
from scipy import sparse
sparse.save_npz("yourmatrix.npz", your_matrix)
your_matrix_back = sparse.load_npz("yourmatrix.npz")
For both functions, the file argument may also be a file-like object (i.e. the result of open) instead of a filename.
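As a rough sketch of the file-like variant, the round trip below goes through an in-memory io.BytesIO buffer instead of a file on disk. The buffer and the small test matrix are assumptions made for illustration; a file opened in binary mode should behave the same way.

```python
import io
import numpy as np
from scipy import sparse

# Hypothetical small matrix standing in for your_matrix.
your_matrix = sparse.csr_matrix(np.array([[0, 1, 0],
                                          [2, 0, 3]]))

buffer = io.BytesIO()
sparse.save_npz(buffer, your_matrix)   # write the .npz archive into the buffer
buffer.seek(0)                         # rewind before reading it back
your_matrix_back = sparse.load_npz(buffer)

# Two sparse matrices are equal when no element differs.
same = (your_matrix != your_matrix_back).nnz == 0
print(same)
```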
Got an answer from the Scipy user group:
A csr_matrix has 3 data attributes that matter: .data, .indices, and .indptr. All are simple ndarrays, so numpy.save will work on them. Save the three arrays with numpy.save or numpy.savez, load them back with numpy.load, and then recreate the sparse matrix object with:
new_csr = csr_matrix((data, indices, indptr), shape=(M, N))
So for example:
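import numpy as np
from scipy.sparse import csr_matrix

def save_sparse_csr(filename, array):
    # np.savez appends ".npz" to filename if it is not already there
    np.savez(filename, data=array.data, indices=array.indices,
             indptr=array.indptr, shape=array.shape)

def load_sparse_csr(filename):
    # filename must include the ".npz" extension here
    loader = np.load(filename)
    return csr_matrix((loader['data'], loader['indices'], loader['indptr']),
                      shape=loader['shape'])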
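One detail worth checking when using this approach: np.savez adds a ".npz" suffix when the given name lacks one, so the name passed to np.load has to include it. The sketch below runs the round trip in a temporary directory; the file name "matrix" and the test matrix are hypothetical. Because only plain numeric arrays are stored, the archive can be read back with allow_pickle=False.

```python
import os
import tempfile
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical matrix used only to exercise the round trip.
original = csr_matrix(np.array([[0.0, 1.5, 0.0],
                                [2.0, 0.0, 0.0]]))

tmpdir = tempfile.mkdtemp()
base = os.path.join(tmpdir, "matrix")   # note: no extension given

np.savez(base, data=original.data, indices=original.indices,
         indptr=original.indptr, shape=original.shape)

saved_path = base + ".npz"              # the name np.savez actually wrote
wrote_suffix = os.path.exists(saved_path)

# No object arrays are stored, so pickle support is not needed to load.
loader = np.load(saved_path, allow_pickle=False)
restored = csr_matrix((loader["data"], loader["indices"], loader["indptr"]),
                      shape=tuple(loader["shape"]))
loader.close()

round_trip_ok = (original != restored).nnz == 0
print(wrote_suffix, round_trip_ok)
```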