I want to remove diagonal elements from a sparse matrix. Since the matrix is sparse, these elements shouldn't be stored once removed.
Scipy provides a method to set diagonal elements values: setdiag
If I try it using lil_matrix, it works:
>>> a = np.ones((2,2))
>>> c = lil_matrix(a)
>>> c.setdiag(0)
>>> c
<2x2 sparse matrix of type '<type 'numpy.float64'>'
with 2 stored elements in LInked List format>
However with csr_matrix, it seems diagonal elements are not removed from storage:
>>> b = csr_matrix(a)
>>> b
<2x2 sparse matrix of type '<type 'numpy.float64'>'
with 4 stored elements in Compressed Sparse Row format>
>>> b.setdiag(0)
>>> b
<2x2 sparse matrix of type '<type 'numpy.float64'>'
with 4 stored elements in Compressed Sparse Row format>
>>> b.toarray()
array([[ 0., 1.],
[ 1., 0.]])
Through a dense array, we have of course:
>>> csr_matrix(b.toarray())
<2x2 sparse matrix of type '<type 'numpy.float64'>'
with 2 stored elements in Compressed Sparse Row format>
Is that intended? If so, is it due to the compressed format of csr matrices? Is there any workaround else than going from sparse to dense to sparse again?
D = diag( v ) returns a square diagonal matrix with the elements of vector v on the main diagonal. D = diag( v , k ) places the elements of vector v on the k th diagonal. k=0 represents the main diagonal, k>0 is above the main diagonal, and k<0 is below the main diagonal.
Diagonals containing only zero are omitted. For sparse matrices with very few non-zero diagonals, such as diagonal or tridiagonal matrices, the DIA format allows for very quick arithmetic operations. Its main limitation is that looking up each matrix element requires performing a blind search through the offsets array.
The function csr_matrix() is used to create a sparse matrix of compressed sparse row format whereas csc_matrix() is used to create a sparse matrix of compressed sparse column format.
The tridiagonal regular sparse matrix where all non-zero elements lie on one of the three diagonals, the main diagonal above and below. Storing Tri-diagonal regular sparse matrices. In a tri-diagonal regular sparse matrix, all the non-zero elements are stored in a 1-dimensional array row by row.
Simply setting elements to 0 does not change the sparsity of a csr
matrix. You have to apply eliminate_zeros
.
In [807]: a=sparse.csr_matrix(np.ones((2,2)))
In [808]: a
Out[808]:
<2x2 sparse matrix of type '<class 'numpy.float64'>'
with 4 stored elements in Compressed Sparse Row format>
In [809]: a.setdiag(0)
In [810]: a
Out[810]:
<2x2 sparse matrix of type '<class 'numpy.float64'>'
with 4 stored elements in Compressed Sparse Row format>
In [811]: a.eliminate_zeros()
In [812]: a
Out[812]:
<2x2 sparse matrix of type '<class 'numpy.float64'>'
with 2 stored elements in Compressed Sparse Row format>
Since changing the sparsity of a csr matrix is relatively expensive, they let you change values to 0 without changing sparsity.
In [829]: %%timeit a=sparse.csr_matrix(np.ones((1000,1000)))
...: a.setdiag(0)
100 loops, best of 3: 3.86 ms per loop
In [830]: %%timeit a=sparse.csr_matrix(np.ones((1000,1000)))
...: a.setdiag(0)
...: a.eliminate_zeros()
SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
10 loops, best of 3: 133 ms per loop
In [831]: %%timeit a=sparse.lil_matrix(np.ones((1000,1000)))
...: a.setdiag(0)
100 loops, best of 3: 14.1 ms per loop
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With