Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Removing diagonal elements from a sparse matrix in scipy

I want to remove diagonal elements from a sparse matrix. Since the matrix is sparse, these elements shouldn't be stored once removed.

Scipy provides a method to set diagonal elements values: setdiag

If I try it using lil_matrix, it works:

>>> a = np.ones((2,2))
>>> c = lil_matrix(a)
>>> c.setdiag(0)
>>> c
<2x2 sparse matrix of type '<type 'numpy.float64'>'
    with 2 stored elements in LInked List format>

However with csr_matrix, it seems diagonal elements are not removed from storage:

>>> b = csr_matrix(a)
>>> b
<2x2 sparse matrix of type '<type 'numpy.float64'>'
    with 4 stored elements in Compressed Sparse Row format>

>>> b.setdiag(0)
>>> b
<2x2 sparse matrix of type '<type 'numpy.float64'>'
    with 4 stored elements in Compressed Sparse Row format>

>>> b.toarray()
array([[ 0.,  1.],
       [ 1.,  0.]])

Through a dense array, we have of course:

>>> csr_matrix(b.toarray())
<2x2 sparse matrix of type '<type 'numpy.float64'>'
    with 2 stored elements in Compressed Sparse Row format>

Is that intended? If so, is it due to the compressed format of csr matrices? Is there any workaround else than going from sparse to dense to sparse again?

like image 471
kevad Avatar asked Nov 23 '16 11:11

kevad


People also ask

How do you extract a diagonal matrix?

D = diag( v ) returns a square diagonal matrix with the elements of vector v on the main diagonal. D = diag( v , k ) places the elements of vector v on the k th diagonal. k=0 represents the main diagonal, k>0 is above the main diagonal, and k<0 is below the main diagonal.

Is diagonal matrix a sparse matrix?

Diagonals containing only zero are omitted. For sparse matrices with very few non-zero diagonals, such as diagonal or tridiagonal matrices, the DIA format allows for very quick arithmetic operations. Its main limitation is that looking up each matrix element requires performing a blind search through the offsets array.

What does Scipy sparse Csr_matrix do?

The function csr_matrix() is used to create a sparse matrix of compressed sparse row format whereas csc_matrix() is used to create a sparse matrix of compressed sparse column format.

What is diagonal sparse matrix?

The tridiagonal regular sparse matrix where all non-zero elements lie on one of the three diagonals, the main diagonal above and below. Storing Tri-diagonal regular sparse matrices. In a tri-diagonal regular sparse matrix, all the non-zero elements are stored in a 1-dimensional array row by row.


1 Answers

Simply setting elements to 0 does not change the sparsity of a csr matrix. You have to apply eliminate_zeros.

In [807]: a=sparse.csr_matrix(np.ones((2,2)))
In [808]: a
Out[808]: 
<2x2 sparse matrix of type '<class 'numpy.float64'>'
    with 4 stored elements in Compressed Sparse Row format>
In [809]: a.setdiag(0)
In [810]: a
Out[810]: 
<2x2 sparse matrix of type '<class 'numpy.float64'>'
    with 4 stored elements in Compressed Sparse Row format>
In [811]: a.eliminate_zeros()
In [812]: a
Out[812]: 
<2x2 sparse matrix of type '<class 'numpy.float64'>'
    with 2 stored elements in Compressed Sparse Row format>

Since changing the sparsity of a csr matrix is relatively expensive, they let you change values to 0 without changing sparsity.

In [829]: %%timeit a=sparse.csr_matrix(np.ones((1000,1000)))
     ...: a.setdiag(0)
100 loops, best of 3: 3.86 ms per loop

In [830]: %%timeit a=sparse.csr_matrix(np.ones((1000,1000)))
     ...: a.setdiag(0)
     ...: a.eliminate_zeros()
SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
10 loops, best of 3: 133 ms per loop

In [831]: %%timeit a=sparse.lil_matrix(np.ones((1000,1000)))
     ...: a.setdiag(0)
100 loops, best of 3: 14.1 ms per loop
like image 158
hpaulj Avatar answered Oct 06 '22 20:10

hpaulj