I have a dataframe like below:
import pandas as pd
df = pd.DataFrame({'col_1': ['6ai', '6aii', '6aii', '6b'],
'col_2': [1, 1, 5, 1],
'col_3': [True, False, True, False]})
col_1 col_2 col_3
0 6ai 1 True
1 6aii 1 False
2 6aii 5 True
3 6b 1 False
I want to group this dataframe on col_1 and then select only the rows where col_3 is True. If a value occurs only once in col_1, I want to select that row regardless of whether col_3 is True or False. So the result I'm after is:
col_1 col_2 col_3
0 6ai 1 True
2 6aii 5 True
3 6b 1 False
I'm thinking that I should use groupby, but I'm not sure. I could really use some help.
Here is one way: keep a row if col_3 is True, or if its col_1 value is not duplicated anywhere in the column.
df[df.col_3|~df.col_1.duplicated(keep=False)]
Out[344]:
col_1 col_2 col_3
0 6ai 1 True
2 6aii 5 True
3 6b 1 False
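To see why this works, it may help to build the mask in two steps. The sketch below assumes the sample data from the question; the names is_repeated, mask and result are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col_1': ['6ai', '6aii', '6aii', '6b'],
                   'col_2': [1, 1, 5, 1],
                   'col_3': [True, False, True, False]})

# keep=False flags every row whose col_1 value appears more than once,
# including the first occurrence (the default keep='first' would skip it).
is_repeated = df['col_1'].duplicated(keep=False)

# Keep a row if col_3 is True, or if its col_1 value is unique.
mask = df['col_3'] | ~is_repeated
result = df[mask]
print(result)
```

Note that a value that occurs several times in col_1 with col_3 False on every row is dropped entirely by this mask.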
You can use groupby().transform('count') to find the col_1 values that occur exactly once:
df[df['col_3'] | df.groupby('col_1')['col_3'].transform('count').eq(1)]
Output:
col_1 col_2 col_3
0 6ai 1 True
2 6aii 5 True
3 6b 1 False
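As a quick sanity check, the duplicated-based mask and the groupby-based mask can be compared on the sample data. This is a minimal sketch: it uses transform('size') instead of transform('count'), which is my substitution because 'count' skips missing values while 'size' counts every row; with this data both give the same answer. The variable names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'col_1': ['6ai', '6aii', '6aii', '6b'],
                   'col_2': [1, 1, 5, 1],
                   'col_3': [True, False, True, False]})

# Approach 1: a col_1 value is unique if it is not flagged as duplicated.
mask_dup = df['col_3'] | ~df['col_1'].duplicated(keep=False)

# Approach 2: broadcast each group's row count back onto its rows.
group_size = df.groupby('col_1')['col_3'].transform('size')
mask_grp = df['col_3'] | group_size.eq(1)

# Both masks should select the same rows.
same = bool((mask_dup == mask_grp).all())
print(same)
print(df[mask_grp])
```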