I have created this pandas dataframe:
import pandas as pd

d = {'Char1': [-3, 2, 0], 'Char2': [0, 1, 2], 'Char3': [-1, 0, -1]}
df = pd.DataFrame(data=d)
print(df)
which looks like this:
   Char1  Char2  Char3
0     -3      0     -1
1      2      1      0
2      0      2     -1

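I need to create two additional fields, Factor1 and Factor2.
For each record, Factor1 should hold the name of the column with the smallest value, and Factor2 the name of the column with the second-smallest value.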
So, the resulting dataset should look like this:
   Char1  Char2  Char3 Factor1 Factor2
0     -3      0     -1   Char1   Char3
1      2      1      0   Char3   Char2
2      0      2     -1   Char3   Char1

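So, let's take a look at the first record: Char1 (-3) is the smallest value, so Factor1 is Char1, and Char3 (-1) is the second smallest, so Factor2 is Char3.
And so on for the other records.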
How can I do this in Python/Pandas?
An efficient method is to use the underlying numpy array with argsort:
import numpy as np
df[['Factor1', 'Factor2']] = df.columns.to_numpy()[np.argsort(df.to_numpy())[:, :2]]
Output:
   Char1  Char2  Char3 Factor1 Factor2
0     -3      0     -1   Char1   Char3
1      2      1      0   Char3   Char2
2      0      2     -1   Char3   Char1
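To see why the one-liner works, it can help to split it into steps. This is only an illustrative breakdown of the same technique; the intermediate names (values, order, two_smallest, names) are not part of the answer and are introduced here for readability.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Char1': [-3, 2, 0], 'Char2': [0, 1, 2], 'Char3': [-1, 0, -1]})

values = df.to_numpy()                       # plain 2D integer array, one row per record
order = np.argsort(values, axis=1)           # per row: column positions, smallest value first
two_smallest = order[:, :2]                  # keep the positions of the two smallest values
names = df.columns.to_numpy()[two_smallest]  # index the column labels with those positions

print(order)
print(names)
```

For the first record the values are [-3, 0, -1], so argsort returns the positions [0, 2, 1], and the first two positions map to Char1 and Char3.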
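To generalize to the N smallest values:
import numpy as np

N = 2
# run this on the original df, before any Factor columns have been added
order = np.argsort(df.to_numpy())[:, :N]
df[[f'Factor{i+1}' for i in range(N)]] = df.columns.to_numpy()[order]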
Example for N=3:
   Char1  Char2  Char3 Factor1 Factor2 Factor3
0     -3      0     -1   Char1   Char3   Char2
1      2      1      0   Char3   Char2   Char1
2      0      2     -1   Char3   Char1   Char2
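If the dataframe also contains non-numeric columns (or Factor columns from an earlier run), the argsort has to be restricted to the value columns. One possible way to package this is sketched below; add_smallest_factors and its parameters are hypothetical names chosen for this example, not an existing pandas or NumPy function. It also assumes that ties should be resolved by column order, which is why a stable sort is requested.

```python
import numpy as np
import pandas as pd

def add_smallest_factors(df, value_cols, n=2, prefix='Factor'):
    """Hypothetical helper: add n columns naming the n smallest value_cols per row."""
    cols = np.asarray(value_cols)
    # Sort only the chosen columns; kind='stable' keeps tied values in column order
    order = np.argsort(df[value_cols].to_numpy(), axis=1, kind='stable')[:, :n]
    out = df.copy()  # leave the caller's dataframe untouched
    out[[f'{prefix}{i + 1}' for i in range(n)]] = cols[order]
    return out

df = pd.DataFrame({'Char1': [-3, 2, 0], 'Char2': [0, 1, 2], 'Char3': [-1, 0, -1]})
result = add_smallest_factors(df, ['Char1', 'Char2', 'Char3'], n=3)
print(result)
```

Because the function works on a copy and takes the value columns explicitly, it can be called repeatedly with different n without the new Factor columns leaking into the sort.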
You can use idxmin to get the column with the smallest value. To get the second smallest, mask out each row's minimum first and call idxmin again:
out = df.assign(**{'factor1': df.idxmin(axis=1),
                   'factor2': df.mask(df.eq(df.min(axis=1), axis=0)).idxmin(axis=1)})
Output:
   Char1  Char2  Char3 factor1 factor2
0     -3      0     -1   Char1   Char3
1      2      1      0   Char3   Char2
2      0      2     -1   Char3   Char1
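One difference between the two approaches shows up when a row's minimum occurs more than once: the mask hides every occurrence of the minimum, so the second idxmin skips the tied column, while argsort still returns it. The small dataframe below (tied) is made up purely to illustrate this; which behavior is wanted depends on how ties should be treated.

```python
import numpy as np
import pandas as pd

# Made-up single-row example where the minimum (1) appears in two columns
tied = pd.DataFrame({'Char1': [1], 'Char2': [1], 'Char3': [5]})

# Mask approach: both 1s are replaced by NaN, so only Char3 is left to pick
row_min = tied.min(axis=1)
masked = tied.mask(tied.eq(row_min, axis=0))
mask_second = masked.idxmin(axis=1)

# argsort approach: the tied column is kept as the second smallest
order = np.argsort(tied.to_numpy(), axis=1, kind='stable')
argsort_second = tied.columns.to_numpy()[order[:, 1]]

print(mask_second.tolist())     # ['Char3']
print(argsort_second.tolist())  # ['Char2']
```

By the same logic, a row whose values are all equal would be masked entirely, so the mask approach needs extra care if such rows can occur.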