
expanding (adding a row or column) a scipy.sparse matrix

Suppose I have an NxN matrix M (lil_matrix or csr_matrix) from scipy.sparse, and I want to make it (N+1)xN, where M_modified[i,j] = M[i,j] for 0 <= i < N (and all j) and M_modified[N,j] = 0 for all j. Basically, I want to add a row of zeros to the bottom of M and preserve the remainder of the matrix. Is there a way to do this without copying the data?

RandomGuy asked Jan 14 '11 19:01





1 Answer

SciPy doesn't have a way to do this without copying the data, but you can do it yourself by changing the attributes that define the sparse matrix.

There are 4 attributes that make up the csr_matrix:

data: An array containing the actual values in the matrix

indices: An array containing the column index corresponding to each value in data

indptr: An array of length (number of rows + 1) that gives, for each row, the position in data where that row's values start; row i occupies data[indptr[i]:indptr[i+1]]. If row i is empty, indptr[i+1] is the same as indptr[i].

shape: A tuple containing the shape of the matrix
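To make the four attributes concrete, here is a minimal sketch on a small matrix with an empty middle row. The matrix values and the variable names (`dense`, `m`, `row2_cols`, `row2_vals`) are illustrative assumptions, not part of the original answer.

```python
import numpy as np
from scipy.sparse import csr_matrix

# A small 3x4 example matrix whose middle row is empty.
dense = np.array([[1, 0, 2, 0],
                  [0, 0, 0, 0],
                  [0, 3, 0, 4]])
m = csr_matrix(dense)

print(m.data)     # stored non-zero values, row by row: [1 2 3 4]
print(m.indices)  # column index of each stored value:  [0 2 1 3]
print(m.indptr)   # row pointers:                       [0 2 2 4]
print(m.shape)    # (3, 4)

# Recover row 2 by hand: it occupies data[indptr[2]:indptr[3]].
start, end = m.indptr[2], m.indptr[3]
row2_cols = m.indices[start:end]
row2_vals = m.data[start:end]
```

Note how the empty row 1 shows up only as a repeated value in `indptr` (the two consecutive 2s); it contributes nothing to `data` or `indices`.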

If you are simply adding a row of zeros to the bottom, all you have to do is change the shape and indptr of your matrix.

```python
import numpy as np
from scipy.sparse import csr_matrix

x = np.ones((3, 5))
x = csr_matrix(x)
x.toarray()
# array([[ 1.,  1.,  1.,  1.,  1.],
#        [ 1.,  1.,  1.,  1.,  1.],
#        [ 1.,  1.,  1.,  1.,  1.]])

# At the time of writing, reshape is not implemented for csr_matrix,
# but you can cheat and set the (private) shape attribute yourself.
x._shape = (4, 5)

# Update indptr to let it know we added a row with nothing in it:
# just append the last value in indptr to the end.
# Note that you are still copying the indptr array.
x.indptr = np.hstack((x.indptr, x.indptr[-1]))
x.toarray()
# array([[ 1.,  1.,  1.,  1.,  1.],
#        [ 1.,  1.,  1.,  1.,  1.],
#        [ 1.,  1.,  1.,  1.,  1.],
#        [ 0.,  0.,  0.,  0.,  0.]])
```
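The same trick extends to adding several empty rows at once. The following is a rough sketch, not a tested library routine: `append_empty_rows` is a hypothetical helper name, and it relies on the private `_shape` attribute just as the snippet above does. It also checks that the `data` buffer is left untouched, so only `indptr` is reallocated.

```python
import numpy as np
from scipy.sparse import csr_matrix

def append_empty_rows(m, k):
    """Hypothetical helper: add k all-zero rows to the bottom of CSR matrix m, in place."""
    # Every new empty row repeats the final indptr entry.
    pad = np.full(k, m.indptr[-1], dtype=m.indptr.dtype)
    m.indptr = np.concatenate((m.indptr, pad))
    # _shape is private; this depends on SciPy internals and may break in future releases.
    m._shape = (m.shape[0] + k, m.shape[1])
    return m

x = csr_matrix(np.ones((3, 5)))
data_before = x.data
append_empty_rows(x, 2)

# The values array was never copied; only indptr was rebuilt.
same_buffer = np.shares_memory(x.data, data_before)
```

Since `indptr` has only one entry per row (plus one), rebuilding it is usually cheap compared with copying `data` and `indices` for a matrix with many stored values.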

Here is a function to handle the more general case of vstacking any two csr_matrices. You still end up copying the underlying NumPy arrays, but it is still significantly faster than the scipy.sparse.vstack method.

```python
def csr_vappend(a, b):
    """Takes in 2 csr_matrices and appends the second one to the bottom of the first one.

    Much faster than scipy.sparse.vstack but assumes the type to be csr and overwrites
    the first matrix instead of copying it. The data, indices, and indptr still get copied.
    """
    a.data = np.hstack((a.data, b.data))
    a.indices = np.hstack((a.indices, b.indices))
    a.indptr = np.hstack((a.indptr, (b.indptr + a.nnz)[1:]))
    a._shape = (a.shape[0] + b.shape[0], b.shape[1])
    return a
```
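If overwriting the first matrix or touching the private `_shape` attribute is a concern, the same concatenation can be done through the public `csr_matrix((data, indices, indptr), shape=...)` constructor. This is a sketch under assumptions: `csr_vstack_pair` is a hypothetical name, it is not the function from the answer, and no timing claim is made for it. The result is compared against `scipy.sparse.vstack` as a sanity check.

```python
import numpy as np
from scipy.sparse import csr_matrix, vstack

def csr_vstack_pair(a, b):
    """Hypothetical non-mutating variant: stack CSR matrix b below CSR matrix a."""
    if a.shape[1] != b.shape[1]:
        raise ValueError("column counts differ")
    data = np.concatenate((a.data, b.data))
    indices = np.concatenate((a.indices, b.indices))
    # Shift b's row pointers by the number of values stored in a, and drop
    # b's leading 0, which would duplicate a's final entry.
    indptr = np.concatenate((a.indptr, b.indptr[1:] + a.indptr[-1]))
    shape = (a.shape[0] + b.shape[0], a.shape[1])
    return csr_matrix((data, indices, indptr), shape=shape)

a = csr_matrix(np.array([[1, 0, 2], [0, 0, 3]]))
b = csr_matrix(np.array([[0, 4, 0], [5, 0, 0], [0, 0, 0]]))

stacked = csr_vstack_pair(a, b)
reference = vstack([a, b], format="csr")
matches = np.array_equal(stacked.toarray(), reference.toarray())
```

Both inputs are left unchanged here, at the cost of allocating new `data`, `indices`, and `indptr` arrays, which the in-place version above also does.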
JakeM answered Oct 18 '22 14:10
