Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficiently Row Standardize a Matrix

I need an efficient way to row standardize a sparse matrix.

Given

W = matrix([[0, 1, 0, 1, 0, 0, 0, 0, 0],
            [1, 0, 1, 0, 1, 0, 0, 0, 0],
            [0, 1, 0, 0, 0, 1, 0, 0, 0],
            [1, 0, 0, 0, 1, 0, 1, 0, 0],
            [0, 1, 0, 1, 0, 1, 0, 1, 0],
            [0, 0, 1, 0, 1, 0, 0, 0, 1],
            [0, 0, 0, 1, 0, 0, 0, 1, 0],
            [0, 0, 0, 0, 1, 0, 1, 0, 1],
            [0, 0, 0, 0, 0, 1, 0, 1, 0]])
row_sums = W.sum(1)

I need to produce...

W2 = matrix([[0.  , 0.5 , 0.  , 0.5 , 0.  , 0.  , 0.  , 0.  , 0.  ],
             [0.33, 0.  , 0.33, 0.  , 0.33, 0.  , 0.  , 0.  , 0.  ],
             [0.  , 0.5 , 0.  , 0.  , 0.  , 0.5 , 0.  , 0.  , 0.  ],
             [0.33, 0.  , 0.  , 0.  , 0.33, 0.  , 0.33, 0.  , 0.  ],
             [0.  , 0.25, 0.  , 0.25, 0.  , 0.25, 0.  , 0.25, 0.  ],
             [0.  , 0.  , 0.33, 0.  , 0.33, 0.  , 0.  , 0.  , 0.33],
             [0.  , 0.  , 0.  , 0.5 , 0.  , 0.  , 0.  , 0.5 , 0.  ],
             [0.  , 0.  , 0.  , 0.  , 0.33, 0.  , 0.33, 0.  , 0.33],
             [0.  , 0.  , 0.  , 0.  , 0.  , 0.5 , 0.  , 0.5 , 0.  ]]) 

Where,

for i in range(9):
    W2[i] = W[i]/row_sums[i]

I'd like to find a way to do this without loops (i.e. Vectorized) and using Scipy.sparse matrices. W could be as large at 10mil x 10mil.

like image 808
Charles Avatar asked Dec 02 '11 15:12

Charles


1 Answers

with a bit of matrix algebra

>>> cc
<9x9 sparse matrix of type '<type 'numpy.int32'>'
    with 24 stored elements in Compressed Sparse Row format>
>>> ccd = sparse.spdiags(1./cc.sum(1).T, 0, *cc.shape)
>>> ccn = ccd * cc
>>> np.round(ccn.todense(), 2)
array([[ 0.  ,  0.5 ,  0.  ,  0.5 ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.33,  0.  ,  0.33,  0.  ,  0.33,  0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.5 ,  0.  ,  0.  ,  0.  ,  0.5 ,  0.  ,  0.  ,  0.  ],
       [ 0.33,  0.  ,  0.  ,  0.  ,  0.33,  0.  ,  0.33,  0.  ,  0.  ],
       [ 0.  ,  0.25,  0.  ,  0.25,  0.  ,  0.25,  0.  ,  0.25,  0.  ],
       [ 0.  ,  0.  ,  0.33,  0.  ,  0.33,  0.  ,  0.  ,  0.  ,  0.33],
       [ 0.  ,  0.  ,  0.  ,  0.5 ,  0.  ,  0.  ,  0.  ,  0.5 ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.33,  0.  ,  0.33,  0.  ,  0.33],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.5 ,  0.  ,  0.5 ,  0.  ]])
>>> ccn
<9x9 sparse matrix of type '<type 'numpy.float64'>'
    with 24 stored elements in Compressed Sparse Row format>
like image 139
Josef Avatar answered Oct 15 '22 16:10

Josef