Pandas conditional groupby

Question

I have a dataframe like below:

import pandas as pd

df = pd.DataFrame({'col_1': ['6ai', '6aii', '6aii', '6b'],
                   'col_2': [1, 1, 5, 1],
                   'col_3': [True, False, True, False]})

  col_1  col_2  col_3
0   6ai      1   True
1  6aii      1  False
2  6aii      5   True
3    6b      1  False

I want to group this dataframe on col_1 and then select only the rows where col_3 is True. Where a value occurs only once in col_1, I want to select that row regardless of whether col_3 is True or False. So the result I'm after is:

  col_1  col_2  col_3
0   6ai      1   True
2  6aii      5   True
3    6b      1  False

I'm thinking that I should use groupby, but I'm not sure. I could really use some help, please.

BENY · Accepted Answer

Here is one way: keep the rows where col_3 is True, plus the rows whose col_1 value is not duplicated:

df[df.col_3|~df.col_1.duplicated(keep=False)]
Out[344]: 
  col_1  col_2  col_3
0   6ai      1   True
2  6aii      5   True
3    6b      1  False
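
To see why this works, the two boolean masks can be inspected separately. This is a minimal sketch, assuming the dataframe from the question with the col_1 values written as strings; the names is_true, is_unique and result are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'col_1': ['6ai', '6aii', '6aii', '6b'],
                   'col_2': [1, 1, 5, 1],
                   'col_3': [True, False, True, False]})

# Mask 1: rows where col_3 is True
is_true = df['col_3']

# Mask 2: duplicated(keep=False) flags every row whose col_1 value is repeated,
# so negating it leaves the rows whose col_1 value occurs exactly once
is_unique = ~df['col_1'].duplicated(keep=False)

# Keep a row if either mask holds
result = df[is_true | is_unique]
print(result)
```

With keep=False, both '6aii' rows are flagged as duplicates, so they survive only through the col_3 mask.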

Quang Hoang · Answer

You can use groupby().transform('count') to find the col_1 values that occur exactly once:

df[df['col_3'] | df.groupby('col_1')['col_3'].transform('count').eq(1)]

Output:

  col_1  col_2  col_3
0   6ai      1   True
2  6aii      5   True
3    6b      1  False
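
As a variation on the same idea, transform('size') can stand in for transform('count'). 'count' counts only non-null entries of col_3, while 'size' counts rows, so the two can differ if col_3 ever contains missing values. The following is a sketch under the assumption that the dataframe matches the question; group_size and result are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'col_1': ['6ai', '6aii', '6aii', '6b'],
                   'col_2': [1, 1, 5, 1],
                   'col_3': [True, False, True, False]})

# Number of rows in each col_1 group, broadcast back to the original index
group_size = df.groupby('col_1')['col_3'].transform('size')

# Keep rows where col_3 is True, or whose group has exactly one row
result = df[df['col_3'] | group_size.eq(1)]
print(result)
```

With either approach, a col_1 value that appears several times with no True in col_3 is dropped entirely, and a value with several True rows keeps all of them.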

Tags: python, pandas, dataframe, conditional-statements, selection

Robin Kuijs
