Sparse matrix: how to get nonzero indices for each row

Tags: python, numpy, scipy, csr, sparse-matrix

I have a SciPy CSR matrix and I want to get the column indices of the nonzero elements in each row. My approach is:

import scipy.sparse as sp
N = 100
d = 0.1
M = sp.rand(N, N, d, format='csr')

indM = [row.nonzero()[1] for row in M]

indM is what I need. It has the same number of rows as M and looks like this:

[array([ 6,  7, 11, ..., 79, 85, 86]),
 array([12, 20, 25, ..., 84, 93, 95]),
...
 array([ 7, 24, 32, 40, 50, 51, 57, 71, 74, 96]),
 array([ 1,  4,  9, ..., 71, 95, 96])]

The problem is that this approach is slow for big matrices. Is there any way to avoid the list comprehension or otherwise speed this up?

Thank you.

724

asked Jun 14 '17 05:06

Alexey Trofimov

1 Answer

You can simply use the indices and indptr attributes directly:

import numpy
import scipy.sparse

N = 5
d = 0.3
M = scipy.sparse.rand(N, N, d, format='csr')
M.toarray()
# array([[ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
#        [ 0.        ,  0.        ,  0.        ,  0.        ,  0.30404632],
#        [ 0.63503713,  0.        ,  0.        ,  0.        ,  0.        ],
#        [ 0.68865311,  0.81492098,  0.        ,  0.        ,  0.        ],
#        [ 0.08984168,  0.87730292,  0.        ,  0.        ,  0.18609702]])

M.indices
# array([4, 0, 0, 1, 0, 1, 4], dtype=int32)
M.indptr
# array([0, 0, 1, 2, 4, 7], dtype=int32)

numpy.split(M.indices, M.indptr)[1:-1]
# [array([], dtype=int32),
#  array([4], dtype=int32),
#  array([0], dtype=int32),
#  array([0, 1], dtype=int32),
#  array([0, 1, 4], dtype=int32)]
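
To convince yourself that this matches the per-row nonzero() approach from the question, here is a small sketch on a fixed matrix with the same sparsity pattern as above. The helper name row_indices and the values in dense are made up for illustration. It splits at M.indptr[1:-1] rather than slicing the result with [1:-1], which should give the same list without the two empty end pieces.

```python
import numpy as np
import scipy.sparse as sp

# Same sparsity pattern as the 5x5 example above; the values are arbitrary.
dense = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.3],
    [0.6, 0.0, 0.0, 0.0, 0.0],
    [0.7, 0.8, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.0, 0.0, 0.2],
])
M = sp.csr_matrix(dense)


def row_indices(csr):
    """Column indices of the stored entries in each row (hypothetical helper)."""
    # Row i occupies csr.indices[csr.indptr[i]:csr.indptr[i + 1]], so the
    # interior entries of indptr are exactly the split points between rows.
    return np.split(csr.indices, csr.indptr[1:-1])


fast = row_indices(M)

# Reference result: the per-row approach from the question.
slow = [M[i].nonzero()[1] for i in range(M.shape[0])]

# Compare sorted copies so the check does not depend on index order within a row.
same = all(np.array_equal(np.sort(a), np.sort(b)) for a, b in zip(fast, slow))
```

One caveat: indices lists every stored entry, including explicitly stored zeros, while nonzero() filters those out. If your matrix might contain explicit zeros, calling M.eliminate_zeros() first should make the two approaches agree.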

117

answered Oct 08 '22 09:10

Nils Werner
