
How can I create a new column in pandas that explains why each row was not selected?

Tags:

pandas

Input

import pandas as pd

df = pd.DataFrame({'Name': ['JOHN', 'ALLEN', 'BOB', 'NIKI', 'CHARLIE', 'CHANG'],
                   'Age': [35, 42, 63, 29, 47, 51],
                   'Salary_in_1000': [100, 93, 78, 120, 64, 115],
                   'FT_Team': ['STEELERS', 'SEAHAWKS', 'FALCONS', 'FALCONS', 'PATRIOTS', 'STEELERS']})

n1 = df['Age'] < 60                       # younger than 60
n2 = df['Salary_in_1000'] >= 100          # salary of at least 100 (in thousands)
n3 = df['FT_Team'].str.startswith('S')    # team name starts with 'S'

Selecting the rows that satisfy all three conditions returns JOHN and CHANG.
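For reference, the selection step could look like the sketch below. It assumes the three conditions are combined with a logical AND, which the question implies but does not show; the name selected is introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['JOHN', 'ALLEN', 'BOB', 'NIKI', 'CHARLIE', 'CHANG'],
    'Age': [35, 42, 63, 29, 47, 51],
    'Salary_in_1000': [100, 93, 78, 120, 64, 115],
    'FT_Team': ['STEELERS', 'SEAHAWKS', 'FALCONS', 'FALCONS', 'PATRIOTS', 'STEELERS'],
})

n1 = df['Age'] < 60
n2 = df['Salary_in_1000'] >= 100
n3 = df['FT_Team'].str.startswith('S')

# Assumption: a row is "selected" only if it passes all three conditions.
selected = df[n1 & n2 & n3]
print(selected['Name'].tolist())  # ['JOHN', 'CHANG']
```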

Goal

I want to create a DataFrame containing the rows that were not selected, with a new column that lists which conditions each row fails. For example,

* ALLEN: n2
* BOB: n1,n2,n3
* NIKI: n3
* CHARLIE: n2,n3

The new column is named reason. Its value is a string listing the names of the condition variables that the row fails.

Try

So far I have had to test each condition separately and record by hand which rules each row violates.

Jack asked Oct 14 '22 21:10



1 Answer

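Build a new DataFrame from the boolean conditions, flag the failures with .eq(False), and take the dot product with the column names (each suffixed with a comma). Every failed condition then contributes its name to the resulting string, and the trailing comma is stripped at the end.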

s = pd.DataFrame({'n1' : n1, 'n2' : n2, 'n3' : n3})

df['reason'] = s.eq(False).dot(s.columns +',').str.rstrip(',')

print(df)
      Name  Age  Salary_in_1000   FT_Team    reason
0     JOHN   35             100  STEELERS          
1    ALLEN   42              93  SEAHAWKS        n2
2      BOB   63              78   FALCONS  n1,n2,n3
3     NIKI   29             120   FALCONS        n3
4  CHARLIE   47              64  PATRIOTS     n2,n3
5    CHANG   51             115  STEELERS       
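The output above still includes the selected rows (JOHN and CHANG) with an empty reason. Since the question asks for only the rows that were not selected, one possible follow-up is sketched below. It reuses the answer's expression unchanged; the name not_selected and the reset_index call are assumptions added here for illustration, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['JOHN', 'ALLEN', 'BOB', 'NIKI', 'CHARLIE', 'CHANG'],
    'Age': [35, 42, 63, 29, 47, 51],
    'Salary_in_1000': [100, 93, 78, 120, 64, 115],
    'FT_Team': ['STEELERS', 'SEAHAWKS', 'FALCONS', 'FALCONS', 'PATRIOTS', 'STEELERS'],
})

n1 = df['Age'] < 60
n2 = df['Salary_in_1000'] >= 100
n3 = df['FT_Team'].str.startswith('S')

# One boolean column per condition, named after the condition variable.
s = pd.DataFrame({'n1': n1, 'n2': n2, 'n3': n3})

# Same expression as in the answer: names of the failed conditions, comma-separated.
df['reason'] = s.eq(False).dot(s.columns + ',').str.rstrip(',')

# Hypothetical extra step: keep only rows that fail at least one condition.
not_selected = df[~s.all(axis=1)].reset_index(drop=True)
print(not_selected[['Name', 'reason']])
```

The dot product works on strings here because in Python True * 'n1,' evaluates to 'n1,' and False * 'n1,' evaluates to an empty string, so summing the products across a row concatenates only the names of the failed conditions.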

Umar.H answered Oct 19 '22 21:10
