I need to iterate over a list and perform a specific operation if a value from the list exists in one of the columns of a pandas DataFrame. I tried the code below, but I get the following error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
import pandas as pd

people = {
    'fname': ['Alex', 'Jane', 'John'],
    'age': [20, 15, 25],
    'sal': [100, 200, 300]
}
df = pd.DataFrame(people)
check_list = ['Alex', 'John']
for column in check_list:
    if (column == df['fname']):
        df['new_column'] = df['sal']/df['age']
    else:
        df['new_column'] = df['sal']
df
Required output:
fname  age  sal  new_column
Alex    20  100           5   <-- sal/age
Jane    15  200         200   <-- sal as it is
John    25  300          12   <-- sal/age
Use np.where with .isin to check whether a column contains particular values.
import numpy as np

df['new_column'] = np.where(
    df['fname'].isin(['Alex', 'John']),
    df['sal']/df['age'],
    df['sal']
)
print(df)
  fname  age  sal  new_column
0  Alex   20  100         5.0
1  Jane   15  200       200.0
2  John   25  300        12.0
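As for why the loop in the question fails: comparing a single string with a whole column gives one boolean per row, and `if` cannot reduce that to a single True/False. Here is a minimal sketch of that behaviour, using the same data as the question. The names mask_eq, mask and error_message are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'fname': ['Alex', 'Jane', 'John'],
    'age': [20, 15, 25],
    'sal': [100, 200, 300],
})

# Comparing a scalar with a column yields a boolean Series, one value per row.
mask_eq = ('Alex' == df['fname'])
print(mask_eq.tolist())  # [True, False, False]

# Using that Series directly in `if` raises the ValueError from the question.
error_message = None
try:
    if mask_eq:
        pass
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# isin builds the per-row mask for the whole list in one step,
# which is what np.where then uses to pick a value for each row.
mask = df['fname'].isin(['Alex', 'John'])
print(mask.tolist())
```

So the fix is not to loop at all, but to build the mask once and let np.where choose between the two expressions row by row.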
A pure pandas version, using Series.where:
df['new_column'] = (df['sal']/df['age']).where(
    df['fname'].isin(['Alex', 'John']), other=df['sal'])
print(df)
  fname  age  sal  new_column
0  Alex   20  100         5.0
1  Jane   15  200       200.0
2  John   25  300        12.0
Try using df.apply with axis=1, which calls a function once per row:
import pandas as pd

people = {
    'fname': ['Alex', 'Jane', 'John'],
    'age': [20, 15, 25],
    'sal': [100, 200, 300]
}
df = pd.DataFrame(people)

def checker(item):
    # item is one row of the DataFrame
    check_list = ['Alex', 'John']
    if item["fname"] in check_list:
        return item['sal']/item['age']
    else:
        return item['sal']

df["Exists"] = df.apply(checker, axis=1)
df
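If you want to confirm that the row-wise apply approach gives the same values as the np.where approach, a quick check along these lines should do. This is only a sketch on the question's sample data. The names via_apply, via_where and same are made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'fname': ['Alex', 'Jane', 'John'],
    'age': [20, 15, 25],
    'sal': [100, 200, 300],
})
check_list = ['Alex', 'John']

def checker(row):
    # Row-wise rule: divide salary by age only for names in check_list.
    if row['fname'] in check_list:
        return row['sal'] / row['age']
    return row['sal']

# Row-by-row result (a Python function call per row).
via_apply = df.apply(checker, axis=1)

# Vectorized result (one mask, one selection over whole columns).
via_where = np.where(df['fname'].isin(check_list),
                     df['sal'] / df['age'],
                     df['sal'])

# Compare the two results as floats.
same = np.allclose(via_apply.to_numpy(dtype=float), via_where)
print(same)
```

Both give the same numbers here. On large frames the vectorized version is usually the faster of the two, since apply with axis=1 runs a Python function for every row.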