My df contains many columns. I want to replace all values only in columns A and B with NaN according to a condition. Also, I want to apply the same condition to another df except on columns C and D. My search so far returns methods that work for a single column.
My attempt so far:
Only on columns A and B:
df[df.loc[:, df.columns['A','B']] < (0.1 * 500)] = np.nan
Except columns A and B:
df[df.loc[:, df.columns != ['A','B']] < (0.1 * 500)] = np.nan
Neither of these works.
I think you need DataFrame.mask:
import pandas as pd

df = pd.DataFrame({
'A':[4,5,4,5,5,4],
'B':[7,8,9,4,2,3],
'C':[1,3,5,7,1,0],
'D':[5,3,6,9,2,4],
}) * 10
c = ['A','B']
df[c] = df[c].mask(df[c] < (0.1 * 500))
print(df)
A B C D
0 NaN 70.0 10 50
1 50.0 80.0 30 30
2 NaN 90.0 50 60
3 50.0 NaN 70 90
4 50.0 NaN 10 20
5 NaN NaN 0 40
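An equivalent form uses `DataFrame.where`, which is the inverse of `mask`: it keeps values where the condition is True and replaces the rest with NaN. A minimal sketch, reusing the same sample data:

```python
import pandas as pd

df = pd.DataFrame({
    'A': [4, 5, 4, 5, 5, 4],
    'B': [7, 8, 9, 4, 2, 3],
    'C': [1, 3, 5, 7, 1, 0],
    'D': [5, 3, 6, 9, 2, 4],
}) * 10

c = ['A', 'B']
# where() keeps values satisfying the condition, so the test is
# the negation of the one passed to mask()
df[c] = df[c].where(df[c] >= (0.1 * 500))
print(df)
```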
c1 = df.columns.difference(c)
df[c1] = df[c1].mask(df[c1] < (0.1 * 500))
print(df)
A B C D
0 NaN 70.0 NaN 50.0
1 50.0 80.0 NaN NaN
2 NaN 90.0 50.0 60.0
3 50.0 NaN 70.0 90.0
4 50.0 NaN NaN NaN
5 NaN NaN NaN NaN
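For the second DataFrame in the question, the same `Index.difference` pattern excludes columns C and D instead. A sketch with a hypothetical `df2` (substitute your own data):

```python
import pandas as pd

# hypothetical second DataFrame, for illustration only
df2 = pd.DataFrame({
    'C': [1, 3, 5],
    'D': [5, 3, 6],
    'E': [2, 8, 4],
}) * 10

# difference() selects every column except C and D
cols = df2.columns.difference(['C', 'D'])
df2[cols] = df2[cols].mask(df2[cols] < (0.1 * 500))
print(df2)
```

Columns C and D are left untouched; only the remaining columns are tested against the threshold.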