Suppose I have a NxN matrix M (lil_matrix or csr_matrix) from scipy.sparse, and I want to make it (N+1)xN where M_modified[i,j] = M[i,j] for 0 <= i < N (and all j) and M[N,j] = 0 for all j. Basically, I want to add a row of zeros to the bottom of M and preserve the remainder of the matrix. Is there a way to do this without copying the data?
Python's SciPy provides tools for creating sparse matrices using multiple data structures, as well as tools for converting a dense matrix to a sparse matrix. Printing a sparse matrix lists each (row, column) position that holds a non-zero value, along with that value.
The function csr_matrix() is used to create a sparse matrix of compressed sparse row format whereas csc_matrix() is used to create a sparse matrix of compressed sparse column format.
A simple and efficient way to add sparse matrices is to convert them to sparse triplet form, concatenate the triplets, and then convert back to sparse column format.
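As a rough sketch of the triplet approach in SciPy terms (assuming "triplet form" corresponds to coo_matrix, and using two small made-up matrices), the row, column, and value arrays can be concatenated and converted to CSC. This relies on SciPy summing duplicate (row, column) entries during the conversion.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Two small example matrices in triplet (COO) form; values are illustrative.
a = coo_matrix(np.array([[1, 0], [0, 2]]))
b = coo_matrix(np.array([[0, 3], [0, 4]]))

# Concatenate the (row, col, value) triplets of both matrices.
rows = np.concatenate((a.row, b.row))
cols = np.concatenate((a.col, b.col))
vals = np.concatenate((a.data, b.data))

# Converting to CSC sums entries that share the same (row, col) position,
# which is what turns the concatenation into an addition.
total = coo_matrix((vals, (rows, cols)), shape=a.shape).tocsc()
print(total.toarray())
```

In everyday code `a + b` does the same job; the sketch only illustrates why concatenating triplets works.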
lil_matrix((M, N), [dtype]) constructs an empty matrix with shape (M, N); dtype is optional and defaults to dtype='d'. Sparse matrices can be used in arithmetic operations: they support addition, subtraction, multiplication, division, and matrix power.
SciPy doesn't have a way to do this without copying the data, but you can do it yourself by changing the attributes that define the sparse matrix.
There are 4 attributes that make up the csr_matrix:
data: An array containing the actual values in the matrix
indices: An array containing the column index corresponding to each value in data
indptr: An array of row pointers with one more entry than there are rows. The values of row i are data[indptr[i]:indptr[i+1]], so indptr[i] is the position in data where row i starts. If row i is empty, indptr[i+1] is the same as indptr[i].
shape: A tuple containing the shape of the matrix
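To make the four attributes concrete, here is a small sketch on a made-up 3x3 matrix whose middle row is empty. The matrix values and variable names are only for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Example matrix with an empty middle row.
dense = np.array([[1, 0, 2],
                  [0, 0, 0],
                  [0, 3, 0]])
m = csr_matrix(dense)

print(m.data)     # [1 2 3]
print(m.indices)  # [0 2 1]
print(m.indptr)   # [0 2 2 3]
print(m.shape)    # (3, 3)

# Row i lives in the slice indptr[i]:indptr[i+1] of data and indices.
# The empty row shows up as two equal consecutive indptr values.
for i in range(m.shape[0]):
    start, end = m.indptr[i], m.indptr[i + 1]
    print(i, m.indices[start:end], m.data[start:end])
```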
If you are simply adding a row of zeros to the bottom, all you have to do is change the shape and indptr for your matrix.
    import numpy as np
    from scipy.sparse import csr_matrix

    x = np.ones((3, 5))
    x = csr_matrix(x)
    x.toarray()
    # array([[ 1.,  1.,  1.,  1.,  1.],
    #        [ 1.,  1.,  1.,  1.,  1.],
    #        [ 1.,  1.,  1.,  1.,  1.]])

    # reshape is not implemented for csr_matrix but you can cheat and do it yourself.
    # Note that _shape is a private attribute.
    x._shape = (4, 5)

    # Update indptr to let it know we added a row with nothing in it. So just append the last
    # value in indptr to the end.
    # Note that you are still copying the indptr array.
    x.indptr = np.hstack((x.indptr, x.indptr[-1]))
    x.toarray()
    # array([[ 1.,  1.,  1.,  1.,  1.],
    #        [ 1.,  1.,  1.,  1.,  1.],
    #        [ 1.,  1.,  1.,  1.,  1.],
    #        [ 0.,  0.,  0.,  0.,  0.]])
Here is a function to handle the more general case of vstacking any 2 csr_matrices. You still end up copying the underlying numpy arrays but it is still significantly faster than the scipy vstack method.
    import numpy as np

    def csr_vappend(a, b):
        """Takes in 2 csr_matrices and appends the second one to the bottom of the first one.

        Much faster than scipy.sparse.vstack but assumes the type to be csr and overwrites
        the first matrix instead of copying it. The data, indices, and indptr still get copied.
        """
        # Shift b's row pointers by a.nnz (read before a.data is replaced) and drop
        # b's leading 0, which would duplicate the last pointer of a.
        new_indptr = np.hstack((a.indptr, (b.indptr + a.nnz)[1:]))
        a.data = np.hstack((a.data, b.data))
        a.indices = np.hstack((a.indices, b.indices))
        a.indptr = new_indptr
        a._shape = (a.shape[0] + b.shape[0], b.shape[1])
        return a
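If overwriting the first matrix or touching the private _shape attribute is a concern, a non-mutating variant is possible. The sketch below is an assumption-laden alternative, not the function above: the name csr_vstack_pair is hypothetical, it builds a new csr_matrix through the public (data, indices, indptr) constructor, and it checks the result against scipy.sparse.vstack on two made-up inputs.

```python
import numpy as np
from scipy.sparse import csr_matrix, vstack

def csr_vstack_pair(a, b):
    """Hypothetical helper: stack CSR matrix b under CSR matrix a, leaving both unchanged."""
    if a.shape[1] != b.shape[1]:
        raise ValueError("both matrices need the same number of columns")
    data = np.concatenate((a.data, b.data))
    indices = np.concatenate((a.indices, b.indices))
    # Shift b's row pointers by the number of stored values in a,
    # and skip b's leading 0 so the pointer arrays join cleanly.
    indptr = np.concatenate((a.indptr, b.indptr[1:] + a.indptr[-1]))
    shape = (a.shape[0] + b.shape[0], a.shape[1])
    return csr_matrix((data, indices, indptr), shape=shape)

# Example inputs; values are illustrative only.
top = csr_matrix(np.array([[1, 0, 2], [0, 0, 3]]))
bottom = csr_matrix(np.array([[0, 0, 0], [4, 0, 0]]))

stacked = csr_vstack_pair(top, bottom)
expected = vstack([top, bottom]).toarray()
print(np.array_equal(stacked.toarray(), expected))
```

The three arrays are still copied by np.concatenate, so this does not avoid the copy either; it only keeps the inputs intact.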