Input
import pandas as pd

df = pd.DataFrame({'Name': ['JOHN', 'ALLEN', 'BOB', 'NIKI', 'CHARLIE', 'CHANG'],
                   'Age': [35, 42, 63, 29, 47, 51],
                   'Salary_in_1000': [100, 93, 78, 120, 64, 115],
                   'FT_Team': ['STEELERS', 'SEAHAWKS', 'FALCONS', 'FALCONS', 'PATRIOTS', 'STEELERS']})
n1 = (df['Age'] < 60)
n2 = (df['Salary_in_1000'] >= 100)
n3 = (df['FT_Team'].str.startswith('S'))
Selecting with all three conditions combined (n1 & n2 & n3) returns JOHN and CHANG.
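For reference, a minimal sketch of that combined selection, assuming the three masks are joined with `&` (the question does not show the exact selection code, and the name `selected` is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Name': ['JOHN', 'ALLEN', 'BOB', 'NIKI', 'CHARLIE', 'CHANG'],
                   'Age': [35, 42, 63, 29, 47, 51],
                   'Salary_in_1000': [100, 93, 78, 120, 64, 115],
                   'FT_Team': ['STEELERS', 'SEAHAWKS', 'FALCONS', 'FALCONS', 'PATRIOTS', 'STEELERS']})

n1 = df['Age'] < 60
n2 = df['Salary_in_1000'] >= 100
n3 = df['FT_Team'].str.startswith('S')

# A row is selected only when every condition holds (assumed combination)
selected = df[n1 & n2 & n3]
print(selected['Name'].tolist())  # ['JOHN', 'CHANG']
```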
Goal
I want to create a dataframe of the rows that are not selected, with a new column that lists which conditions each row fails. For example,
* ALLEN: n2
* BOB: n1,n2,n3
* NIKI: n3
* CHARLIE: n2,n3
The new column is named reason. Its value is the names of the failed condition variables, stored as a string.
Try
So far I have to check each condition and record by hand which rules each row violates.
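Create a new dataframe from the conditions, then use .dot (matrix multiplication) on the inverted boolean values and the column names. Multiplying a string by True keeps it and multiplying by False gives an empty string, so the dot product concatenates the names of the failed conditions for each row.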
s = pd.DataFrame({'n1' : n1, 'n2' : n2, 'n3' : n3})
df['reason'] = s.eq(False).dot(s.columns +',').str.rstrip(',')
print(df)
      Name  Age  Salary_in_1000   FT_Team    reason
0     JOHN   35             100  STEELERS
1    ALLEN   42              93  SEAHAWKS        n2
2      BOB   63              78   FALCONS  n1,n2,n3
3     NIKI   29             120   FALCONS        n3
4  CHARLIE   47              64  PATRIOTS     n2,n3
5    CHANG   51             115  STEELERS