 

What does csr_matrix.sort_indices do?

I make a csr_matrix in the following way:

>>> import numpy as np
>>> from scipy import sparse
>>> A = sparse.csr_matrix([[0, 1, 0],
...                        [1, 0, 1],
...                        [0, 1, 0]])
>>> A[2,:] = np.array([-1, -2, -3])

>>> A.indptr
Out[12]: array([0, 1, 3, 6], dtype=int32)
>>> A.indices
Out[13]: array([1, 0, 2, 0, 2, 1], dtype=int32)
>>> A.data
Out[14]: array([ 1,  1,  1, -1, -3, -2], dtype=int64)

Now I want to interchange the last two elements in the indices and data arrays, so I try:

>>> A.sort_indices()

However, this does not change my matrix. The manual for this function only states that it sorts the indices.

  1. What does this function do? Under what conditions can you see a difference?
  2. How can I sort the indices and data arrays so that the indices are sorted within each row?
physicalattraction asked Oct 31 '22 10:10



1 Answer

As stated in the documentation, A.sort_indices() sorts the indices in place. However, the sorted state is cached: if A.has_sorted_indices is True, the call does nothing (the cache was introduced in 0.7.0).

So, to see a difference, you need to manually set A.has_sorted_indices to False.

>>> A.has_sorted_indices, A.indices
(True, array([1, 0, 2, 0, 2, 1], dtype=int32))
>>> A.sort_indices()
>>> A.has_sorted_indices, A.indices
(True, array([1, 0, 2, 0, 2, 1], dtype=int32))
>>> A.has_sorted_indices = False
>>> A.sort_indices()
>>> A.has_sorted_indices, A.indices
(True, array([1, 0, 2, 0, 1, 2], dtype=int32))
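Since the cached flag can disagree with the arrays, it can help to inspect indptr and indices directly. Here is a minimal sketch that does this. The helper name rows_are_sorted is hypothetical (it is not part of SciPy), and the matrix is built from the raw arrays shown in the question, on the assumption that the constructor does not reorder them.

```python
import numpy as np
from scipy import sparse

def rows_are_sorted(m):
    """Hypothetical helper: True if column indices are non-decreasing within every row."""
    for start, end in zip(m.indptr[:-1], m.indptr[1:]):
        row = m.indices[start:end]
        # Compare each column index with its predecessor in the same row.
        if np.any(row[1:] < row[:-1]):
            return False
    return True

# Raw CSR arrays from the question; row 2 stores its columns as 0, 2, 1.
indptr = np.array([0, 1, 3, 6])
indices = np.array([1, 0, 2, 0, 2, 1])
data = np.array([1, 1, 1, -1, -3, -2])
A = sparse.csr_matrix((data, indices, indptr), shape=(3, 3))

before = rows_are_sorted(A)   # checks the arrays, not the cached flag
A.sort_indices()
after = rows_are_sorted(A)
print(before, after)
```

The helper never reads A.has_sorted_indices, so it reports the actual state of the arrays even when the flag is stale.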

Note that, contrary to what the OP shows, as of SciPy 0.19.0 running A[2, :] = [-1, -2, -3] no longer produces out-of-order indices (this should have been fixed in 0.14.0). On the other hand, this operation produces a warning:

SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.

Anyway, out-of-order indices are easy to produce in other ways, e.g. by matrix multiplication:

>>> import scipy.sparse
>>> B = scipy.sparse.csr_matrix([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
>>> C = B*B
>>> C.has_sorted_indices, C.indices
(0, array([2, 0, 1, 2, 0], dtype=int32))
>>> C.sort_indices()
>>> C.has_sorted_indices, C.indices
(True, array([0, 2, 1, 0, 2], dtype=int32))
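For the second part of the question (sorting indices and data together, row by row), the sketch below contrasts the in-place sort_indices() with sorted_indices(), which returns a sorted copy. It rebuilds the question's matrix from its raw arrays, so it assumes the constructor keeps the order it is given. The variable names are illustrative only.

```python
import numpy as np
from scipy import sparse

# Raw CSR arrays from the question; row 2 has its columns out of order (0, 2, 1).
indptr = np.array([0, 1, 3, 6])
indices = np.array([1, 0, 2, 0, 2, 1])
data = np.array([1, 1, 1, -1, -3, -2])

A = sparse.csr_matrix((data, indices, indptr), shape=(3, 3))
dense_before = A.toarray()

# sorted_indices() returns a sorted copy and leaves A untouched.
S = A.sorted_indices()
indices_after_copy = A.indices.tolist()

# sort_indices() reorders A.indices and A.data in place, row by row.
A.sort_indices()

print(S.indices, S.data)
print(A.indices, A.data)
```

In both cases the data array is permuted together with indices, so the matrix itself is unchanged and only the storage order within each row differs.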
kennytm answered Nov 15 '22 04:11
