I need to apply a single condition to 3 columns of a DataFrame and change the value of a 4th column, without chaining or statements.
I can do it with np.where, but if the number of columns is large it's going to take a lot of time.
import pandas as pd
import numpy as np
df = pd.DataFrame({'a':[1,2,3,4],'b':[1,3,6,7],'c':[4,6,4,1], 'd':['p','f','p','u'],'e':['a','a','b','c']})
df['d'] = np.where((df.a > 4) | (df.b > 4) | (df.c > 4), 'p', df['d'])  # parentheses needed: | binds tighter than >
df = pd.DataFrame({'a':[1,2,3,4],'b':[1,3,6,7],'c':[4,6,4,1], 'd':['p','f','p','f']})
df['d'] = np.where((df.a > 4) | (df.b > 4) | (df.c > 4), 'p', 'f')  # parentheses needed: | binds tighter than >
I need some way of applying the same condition (>, <) to a list of columns without writing an or for each one.
Use DataFrame.gt along with np.where:
import numpy as np
import pandas as pd
df = pd.DataFrame({'a':[1,2,3,4],'b':[1,3,6,7],'c':[4,6,4,1], 'd':['p','f','p','u'],'e':['a','a','b','c']})
# select the subset of columns on which you want to check the condition
new_df = df[['a','b','c']]
mask = new_df.gt(4).any(axis=1) # check if any value is greater than 4
df['d'] = np.where(mask, 'p','f')
print(df)
Output:
   a  b  c  d  e
0  1  1  4  f  a
1  2  3  6  p  a
2  3  6  4  p  b
3  4  7  1  p  c
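Note that np.where(mask, 'p', 'f') overwrites every row of d. If the existing values of d should be kept wherever the condition is False (as in the first snippet of the question), a minimal sketch could use df.loc with the same mask. The list name cols, the threshold 2 used for the < case, and the names mask_gt and mask_lt are illustrative assumptions, not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 2, 3, 4],
    'b': [1, 3, 6, 7],
    'c': [4, 6, 4, 1],
    'd': ['p', 'f', 'p', 'u'],
    'e': ['a', 'a', 'b', 'c'],
})

# Hypothetical list of columns to test; extend it as needed.
cols = ['a', 'b', 'c']

# True for rows where any listed column is greater than 4.
mask_gt = df[cols].gt(4).any(axis=1)

# The same pattern works for "<" via DataFrame.lt (threshold 2 is illustrative).
mask_lt = df[cols].lt(2).any(axis=1)

# Overwrite d only where the condition holds; other rows keep their value.
df.loc[mask_gt, 'd'] = 'p'
print(df)
```

Swapping any(axis=1) for all(axis=1) would instead require every listed column to satisfy the condition.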