I need an efficient way to row standardize a sparse matrix.
Given
W = matrix([[0, 1, 0, 1, 0, 0, 0, 0, 0],
[1, 0, 1, 0, 1, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 1, 0, 0, 0],
[1, 0, 0, 0, 1, 0, 1, 0, 0],
[0, 1, 0, 1, 0, 1, 0, 1, 0],
[0, 0, 1, 0, 1, 0, 0, 0, 1],
[0, 0, 0, 1, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 1, 0, 1, 0, 1],
[0, 0, 0, 0, 0, 1, 0, 1, 0]])
row_sums = W.sum(1)
I need to produce...
W2 = matrix([[0. , 0.5 , 0. , 0.5 , 0. , 0. , 0. , 0. , 0. ],
[0.33, 0. , 0.33, 0. , 0.33, 0. , 0. , 0. , 0. ],
[0. , 0.5 , 0. , 0. , 0. , 0.5 , 0. , 0. , 0. ],
[0.33, 0. , 0. , 0. , 0.33, 0. , 0.33, 0. , 0. ],
[0. , 0.25, 0. , 0.25, 0. , 0.25, 0. , 0.25, 0. ],
[0. , 0. , 0.33, 0. , 0.33, 0. , 0. , 0. , 0.33],
[0. , 0. , 0. , 0.5 , 0. , 0. , 0. , 0.5 , 0. ],
[0. , 0. , 0. , 0. , 0.33, 0. , 0.33, 0. , 0.33],
[0. , 0. , 0. , 0. , 0. , 0.5 , 0. , 0.5 , 0. ]])
Where,
for i in range(9):
W2[i] = W[i]/row_sums[i]
I'd like to find a way to do this without loops (i.e. Vectorized) and using Scipy.sparse matrices. W could be as large at 10mil x 10mil.
with a bit of matrix algebra
>>> cc
<9x9 sparse matrix of type '<type 'numpy.int32'>'
with 24 stored elements in Compressed Sparse Row format>
>>> ccd = sparse.spdiags(1./cc.sum(1).T, 0, *cc.shape)
>>> ccn = ccd * cc
>>> np.round(ccn.todense(), 2)
array([[ 0. , 0.5 , 0. , 0.5 , 0. , 0. , 0. , 0. , 0. ],
[ 0.33, 0. , 0.33, 0. , 0.33, 0. , 0. , 0. , 0. ],
[ 0. , 0.5 , 0. , 0. , 0. , 0.5 , 0. , 0. , 0. ],
[ 0.33, 0. , 0. , 0. , 0.33, 0. , 0.33, 0. , 0. ],
[ 0. , 0.25, 0. , 0.25, 0. , 0.25, 0. , 0.25, 0. ],
[ 0. , 0. , 0.33, 0. , 0.33, 0. , 0. , 0. , 0.33],
[ 0. , 0. , 0. , 0.5 , 0. , 0. , 0. , 0.5 , 0. ],
[ 0. , 0. , 0. , 0. , 0.33, 0. , 0.33, 0. , 0.33],
[ 0. , 0. , 0. , 0. , 0. , 0.5 , 0. , 0.5 , 0. ]])
>>> ccn
<9x9 sparse matrix of type '<type 'numpy.float64'>'
with 24 stored elements in Compressed Sparse Row format>
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With