
How do I set the max value of each dataframe row to 1 and the rest of the values to 0?

Original dataframe:

ix x  y  z    
0  3  4  1 
1  2  0  6
2  7  1  0
3  0  0  0

Should transform into:

ix x  y  z    
0  0  1  0 
1  0  0  1
2  1  0  0
3  0  0  0

As you can see, I'm taking the max value in each row and setting it to 1, while the other values in that row become 0. Also, you'll notice that row 3 stays the same since all of its values are 0.

So, I've been able to extract the index of the max value using:

x.idxmax(axis = 1)

But I'm not sure what to do with the max indices. I was thinking of using np.where, but I don't see a conditional statement I could use. Or so I think.

Any help would be much appreciated.

madsthaks asked Jan 28 '23 19:01


1 Answer

First, select the rows of the dataframe that contain at least one non-zero value. Then compare each of those rows against its own maximum and cast the resulting boolean mask to integers:

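# Boolean mask of rows that contain at least one non-zero value
affected = (df != 0).any(axis=1)
nz = df[affected]
# Transpose so each row's maximum lines up with its own column, compare, transpose back
df[affected] = (nz.T == nz.max(axis=1)).T.astype(int)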
#    x  y  z
#0   0  1  0
#1   0  0  1
#2   1  0  0
#3   0  0  0
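
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea done on the underlying NumPy array. The helper names `one_hot_row_max` and `one_hot_first_max` are hypothetical (not from the question or from pandas), and the sample frame is rebuilt from the question's data. Note that the comparison approach marks every tied maximum in a row with 1; the second helper is one possible way to keep only the first maximum, which is closer to what `idxmax` returns.

```python
import numpy as np
import pandas as pd


def one_hot_row_max(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: 1 where a value equals its row maximum, else 0.

    Rows made up entirely of zeros stay all zeros. Tied maxima all get 1.
    """
    values = df.to_numpy()
    # Row-wise maximum, kept as a column vector so it broadcasts per row.
    row_max = values.max(axis=1, keepdims=True)
    # Rows that contain at least one non-zero entry.
    has_nonzero = (values != 0).any(axis=1, keepdims=True)
    mask = (values == row_max) & has_nonzero
    return pd.DataFrame(mask.astype(int), index=df.index, columns=df.columns)


def one_hot_first_max(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical variant: only the first maximum in each row gets 1."""
    values = df.to_numpy()
    out = np.zeros(values.shape, dtype=int)
    # argmax returns the position of the first maximum in each row.
    out[np.arange(values.shape[0]), values.argmax(axis=1)] = 1
    # Reset rows that are entirely zero.
    out[~(values != 0).any(axis=1)] = 0
    return pd.DataFrame(out, index=df.index, columns=df.columns)


# Sample data rebuilt from the question.
df = pd.DataFrame({"x": [3, 2, 7, 0], "y": [4, 0, 1, 0], "z": [1, 6, 0, 0]})
result = one_hot_row_max(df)
print(result)
```

Both helpers return a new frame rather than modifying `df` in place, so the original values stay available for comparison.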
DYZ answered Jan 30 '23 08:01