I have a dataframe similar to the one below, and I would like to create a new column that contains True/False depending on whether, for each project, sector "a" has been covered at least once. I'm trying to use groupby() together with the .transform() method, but since my data is text, I don't know how to apply it.
project  sector
01       a
01       b
02       b
02       b
03       a
03       a

Desired output:

project  sector  new_col
01       a       true
01       b       true
02       b       false
02       b       false
03       a       true
03       a       true
You could try the following:
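df['new_col'] = df.groupby('project')['sector'].transform(lambda x: (x == 'a').any())
This groups the rows by project and checks whether any sector in each group equals 'a'; transform then broadcasts that single True/False result back to every row of the group.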
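For reference, here is a minimal runnable sketch of the line above applied to the sample data from the question. The way the dataframe is built (project IDs stored as strings) is an assumption, since the question only shows the printed table.

```python
import pandas as pd

# Sample data copied from the question; storing project IDs as strings
# is an assumption about the original dataframe.
df = pd.DataFrame({
    "project": ["01", "01", "02", "02", "03", "03"],
    "sector": ["a", "b", "b", "b", "a", "a"],
})

# For each project, test whether any row has sector "a"; transform
# repeats that single result on every row of the project.
df["new_col"] = df.groupby("project")["sector"].transform(
    lambda x: (x == "a").any()
)

print(df)
```

Projects 01 and 03 each contain at least one "a" row, so all of their rows are flagged, while project 02 is not.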
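This is not the fastest option, but it should work: compute one True/False value per project, then merge it back onto the original rows.
# One value per project: True if 'a' is among that project's unique sectors.
new_col = your_db.groupby('project')['sector'].unique().apply(lambda x: 'a' in x).rename('new_col')
# Attach the per-project flag to every row of that project.
your_db = your_db.merge(new_col, how='inner', on='project')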
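If speed matters, one possible alternative (not taken from either answer above) is to skip the groupby entirely and use isin. This is only a sketch on the question's sample data; the helper name projects_with_a is made up for illustration.

```python
import pandas as pd

# Sample data from the question; the dataframe construction is an assumption.
your_db = pd.DataFrame({
    "project": ["01", "01", "02", "02", "03", "03"],
    "sector": ["a", "b", "b", "b", "a", "a"],
})

# Projects that have sector "a" in at least one row
# (projects_with_a is a hypothetical name used only in this sketch).
projects_with_a = your_db.loc[your_db["sector"] == "a", "project"].unique()

# Flag every row whose project appears in that set.
your_db["new_col"] = your_db["project"].isin(projects_with_a)

print(your_db)
```

This avoids calling a Python lambda once per group, which may help on dataframes with many projects, though the actual difference depends on the data.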