I have a user-user similarity matrix in which some rows contain duplicated values and NaN:
userId  316       320       359       370       910
userId                                             
316     1.0  0.500000  0.500000  0.500000       NaN
320     0.5  1.000000  0.242837  0.019035  0.031737
359     0.5  0.242837  1.000000  0.357620  0.175914
370     0.5  0.019035  0.357620  1.000000  0.317371
910     NaN  0.031737  0.175914  0.317371  1.000000
I want to rank the similarities within each row so that every rank in a row is distinct, like so:
userId  316  320  359  370  910
userId                         
316       1    2    3    4   NaN
320       2    1    3    5    4
359       2    4    1    3    5
370       2    5    3    1    4
910      NaN   4    3    2    1
The order of the ranks among equal values is not important, but each rank within a row must be distinct, and NaN must be kept.
I tried df.rank(ascending=False, axis=1), but it gives tied values the same rank instead of distinct ranks.
I also tried scipy.stats.rankdata, but it does not keep NaN.
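Use rank with method='first'. Tied values are then ranked in the order they appear in the row, so every rank is distinct, and NaN is kept by default (na_option='keep').
df.rank(axis=1, ascending=False, method='first')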
     316  320  359  370  910
316  1.0  2.0  3.0  4.0  NaN
320  2.0  1.0  3.0  5.0  4.0
359  2.0  4.0  1.0  3.0  5.0
370  2.0  5.0  3.0  1.0  4.0
910  NaN  4.0  3.0  2.0  1.0
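For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the matrix from the question by hand (the variable names ids, ranks and int_ranks are mine, not from the question) and assumes a reasonably recent pandas. The final conversion to the nullable Int64 dtype is an optional extra, not part of the answer above; it is only useful if you want integer ranks like in the desired output while still keeping the missing entries.

```python
import numpy as np
import pandas as pd

# Rebuild the similarity matrix from the question (values copied by hand).
ids = [316, 320, 359, 370, 910]
df = pd.DataFrame(
    [
        [1.0, 0.5, 0.5, 0.5, np.nan],
        [0.5, 1.0, 0.242837, 0.019035, 0.031737],
        [0.5, 0.242837, 1.0, 0.357620, 0.175914],
        [0.5, 0.019035, 0.357620, 1.0, 0.317371],
        [np.nan, 0.031737, 0.175914, 0.317371, 1.0],
    ],
    index=pd.Index(ids, name="userId"),
    columns=pd.Index(ids, name="userId"),
)

# Rank within each row (axis=1), highest similarity first.
# method="first" breaks ties by order of appearance, so ranks are distinct.
# NaN entries stay NaN because na_option defaults to "keep".
ranks = df.rank(axis=1, ascending=False, method="first")
print(ranks)

# Optional (my assumption about what you may want): integer ranks.
# The nullable Int64 dtype can hold integers alongside missing values.
int_ranks = ranks.astype("Int64")
print(int_ranks)
```

Note that which of the tied 0.5 values gets rank 2, 3 or 4 in row 316 depends only on column order, which is fine here since the question says the order among equal values does not matter.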