 

python pandas "groupby" and "if any" condition

Tags: python, pandas

I have a dataframe similar to the one below, and I would like to create a new column that contains true/false depending on whether, for each project, the sector "a" has been covered at least once. I'm trying the groupby() function and wanted to use the .transform() method, but since my data is text, I don't know how to use it.

      project    sector  
    
        01         a    
        01         b    
        02         b     
        02         b     
        03         a     
        03         a     
    



Expected output:

      project    sector    new_col

        01         a        true
        01         b        true
        02         b        false
        02         b        false
        03         a        true
        03         a        true

       
LBedo asked Dec 15 '21 12:12



2 Answers

You could try the following:

df['new_col'] = df.groupby('project')['sector'].transform(lambda x: (x == 'a').any())

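This groups the rows by project and checks whether any sector in each group equals 'a'. Because transform() is used, that single true/false result is broadcast back to every row of the group.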
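To try the line above end to end, here is a minimal sketch. It assumes the dataframe is called df and that the project IDs are stored as strings (so the leading zeros from the question survive); both are assumptions, since the question does not show how the data was loaded.

```python
import pandas as pd

# Sample data rebuilt from the question; storing "project" as strings
# is an assumption made to keep the leading zeros.
df = pd.DataFrame({
    "project": ["01", "01", "02", "02", "03", "03"],
    "sector": ["a", "b", "b", "b", "a", "a"],
})

# For each project, check whether sector "a" appears at least once.
# transform() repeats the per-group result on every row of that group.
df["new_col"] = df.groupby("project")["sector"].transform(
    lambda x: (x == "a").any()
)

print(df)
```

Projects 01 and 03 should come out flagged on all of their rows, and project 02 should not, which matches the table in the question.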

Phil Leh answered Sep 28 '22 15:09



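This is not the fastest option, but it should work. It collects the unique sectors per project, checks whether 'a' is among them, and merges that flag back onto the original dataframe.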

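# One row per project: True if 'a' is among that project's unique sectors
new_col = your_db.groupby('project')['sector'].unique().apply(lambda x: 'a' in x).rename('new_col')
# Attach the per-project flag to every row of the matching project
your_db = your_db.merge(new_col, how='inner', on='project')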
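If speed matters, one possible alternative (a different technique from the unique-and-merge approach above) is to build a boolean mask first and reduce it per group with the built-in "any" aggregation, which avoids both the Python-level lambda and the merge. The sketch below is hedged: the dataframe name df and the string project IDs are assumptions, and no timing has been measured here.

```python
import pandas as pd

# Sample data rebuilt from the question (string project IDs are an assumption).
df = pd.DataFrame({
    "project": ["01", "01", "02", "02", "03", "03"],
    "sector": ["a", "b", "b", "b", "a", "a"],
})

# Step 1: row-level boolean mask, True where the sector is "a".
has_a = df["sector"].eq("a")

# Step 2: group the mask by project and broadcast "any" back to each row.
df["new_col"] = has_a.groupby(df["project"]).transform("any")

print(df)
```

Passing the string "any" lets pandas use its own group reduction instead of calling a Python function once per group, which is usually where the time goes on large frames.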
Rafa answered Sep 28 '22 13:09
