
Finding the indices of the rows where there are non-zero entries in a sparse csc_matrix

I have a SciPy sparse matrix, X:

>>> type(X)
<class 'scipy.sparse.csc.csc_matrix'>

I am interested in finding the indices of the rows where there are non-zero entries, in the 0th column. I tried:

getcol = X.getcol(0)
print(getcol)

which gives me:

(0, 0)  1
(2, 0)  1
(5, 0)  10

This is great, but what I want is a vector that has 0, 2, 5 in it.

How do I get the indices I'm looking for?

Thanks for the help.

tumultous_rooster asked Oct 21 '22 10:10


1 Answer

With a CSC matrix you can do the following:

>>> import numpy as np
>>> import scipy.sparse as sps
>>> a = np.array([[1, 0, 0],
...               [0, 1, 0],
...               [1, 0, 1],
...               [0, 0, 1],
...               [0, 1, 0],
...               [1, 0, 1]])
>>> aa = sps.csc_matrix(a)
>>> aa.indices[aa.indptr[0]:aa.indptr[1]]
array([0, 2, 5])
>>> aa.indices[aa.indptr[1]:aa.indptr[2]]
array([1, 4])
>>> aa.indices[aa.indptr[2]:aa.indptr[3]]
array([2, 3, 5])

So aa.indices[aa.indptr[col]:aa.indptr[col+1]] gives the row indices of the entries stored in column col, which should be what you are after.
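The slicing above can be wrapped in a small helper. This is only a sketch: the function name nonzero_rows is made up for illustration and is not part of SciPy. It also assumes you want to skip explicitly stored zeros, which a CSC matrix may contain and which the plain indices slice would still report.

```python
import numpy as np
import scipy.sparse as sps


def nonzero_rows(mat, col):
    """Row indices of the non-zero entries in column `col`.

    Hypothetical helper (not a SciPy function). `mat` is converted to
    CSC if it is not already in that format.
    """
    mat = sps.csc_matrix(mat)
    # indptr[col]:indptr[col + 1] delimits the entries stored for this column
    start, stop = mat.indptr[col], mat.indptr[col + 1]
    rows = mat.indices[start:stop]
    vals = mat.data[start:stop]
    # Drop entries that are stored explicitly but hold the value 0
    return rows[vals != 0]


a = np.array([[1, 0, 0],
              [0, 1, 0],
              [1, 0, 1],
              [0, 0, 1],
              [0, 1, 0],
              [1, 0, 1]])
aa = sps.csc_matrix(a)

print(nonzero_rows(aa, 0))  # [0 2 5]
print(nonzero_rows(aa, 1))  # [1 4]
```

If speed is not a concern, aa.getcol(0).nonzero()[0] should give the same row indices without touching indptr directly, at the cost of building an intermediate matrix.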

Jaime answered Oct 23 '22 02:10