I'd like to add a new column to a pandas DataFrame, populated with True or False based on the other values in each row. My approach was to apply a function that checks Boolean conditions across each row and fills the new column with either True or False.
This is the dataframe:
import pandas as pd

l = {'DayTime': ['2018-03-01', '2018-03-02', '2018-03-03'],
     'Pressure': [9, 10.5, 10.5], 'Feed': [9, 10.5, 11], 'Temp': [9, 10.5, 11]}
df1 = pd.DataFrame(l)
This is the function I wrote:
def ops_on(row):
    return row[('Feed' > 10)
               & ('Pressure' > 10)
               & ('Temp' > 10)
               ]
The function ops_on is used to create the new column ['ops_on']:
df1['ops_on'] = df1.apply(ops_on, axis='columns')
Unfortunately, I get this error message:
TypeError: ("'>' not supported between instances of 'str' and 'int'", 'occurred at index 0')
Thankful for help.
You should work column-wise (vectorised, efficient) rather than row-wise (inefficient, Python loop):
df1['ops_on'] = (df1['Feed'] > 10) & (df1['Pressure'] > 10) & (df1['Temp'] > 10)
The & ("and") operator is applied to Boolean series element-wise. An arbitrary number of such conditions can be chained.
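To see the column-wise version end to end, here is a small self-contained sketch that rebuilds the df1 from the question and adds the column. The parentheses around each comparison matter, because & binds more tightly than > in Python.

```python
import pandas as pd

# Same data as in the question
df1 = pd.DataFrame({
    'DayTime': ['2018-03-01', '2018-03-02', '2018-03-03'],
    'Pressure': [9, 10.5, 10.5],
    'Feed': [9, 10.5, 11],
    'Temp': [9, 10.5, 11],
})

# Each comparison yields a Boolean Series; & combines them element-wise.
df1['ops_on'] = (df1['Feed'] > 10) & (df1['Pressure'] > 10) & (df1['Temp'] > 10)

print(df1['ops_on'].tolist())  # [False, True, True]
```

Only the first row fails, since all three of its values are 9.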
Alternatively, for the special case where you are performing the same comparison multiple times:
df1['ops_on'] = df1[['Feed', 'Pressure', 'Temp']].gt(10).all(axis=1)
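As a rough sketch of how this shortcut relates to the chained version, the snippet below computes both on the data from the question and checks that they agree. The names cols, threshold and chained are only illustrative and do not appear in the question.

```python
import pandas as pd

# Same data as in the question
df1 = pd.DataFrame({
    'DayTime': ['2018-03-01', '2018-03-02', '2018-03-03'],
    'Pressure': [9, 10.5, 10.5],
    'Feed': [9, 10.5, 11],
    'Temp': [9, 10.5, 11],
})

cols = ['Feed', 'Pressure', 'Temp']  # illustrative name: columns to test
threshold = 10                       # illustrative name: shared cut-off

# gt() compares every selected cell with the threshold;
# all(axis=1) is True only where every column in that row passed.
df1['ops_on'] = df1[cols].gt(threshold).all(axis=1)

# The explicit chained form, for comparison
chained = (df1['Feed'] > 10) & (df1['Pressure'] > 10) & (df1['Temp'] > 10)
same = bool((df1['ops_on'] == chained).all())
```

This form is convenient when the list of columns is long or built at runtime, since adding a column only means extending the list.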
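Your function fails because 'Feed' > 10 compares the string literal 'Feed' with the integer 10, instead of comparing the value stored in the row. Index the row first, then compare. In your current setup, just re-write your function like this:
def ops_on(row):
    return (row['Feed'] > 10) & (row['Pressure'] > 10) & (row['Temp'] > 10)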
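For completeness, here is a sketch of the row-wise route on the data from the question. It first reproduces the original TypeError in isolation, then applies the rewritten function. The variable error_message is only there for illustration.

```python
import pandas as pd

# Same data as in the question
df1 = pd.DataFrame({
    'DayTime': ['2018-03-01', '2018-03-02', '2018-03-03'],
    'Pressure': [9, 10.5, 10.5],
    'Feed': [9, 10.5, 11],
    'Temp': [9, 10.5, 11],
})

# The original function compared a string literal with an int,
# which Python rejects before pandas is even involved.
try:
    'Feed' > 10
except TypeError as exc:
    error_message = str(exc)  # illustrative variable

def ops_on(row):
    # row is a Series holding one row; look up each value by column label first
    return (row['Feed'] > 10) & (row['Pressure'] > 10) & (row['Temp'] > 10)

# axis='columns' passes each row to ops_on in turn (a Python-level loop)
df1['ops_on'] = df1.apply(ops_on, axis='columns')

print(df1['ops_on'].tolist())  # [False, True, True]
```

This gives the same result as the column-wise expression, but it calls the function once per row, so it is likely to be noticeably slower on large frames.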