Pandas: Selecting rows for which groupby.sum() satisfies condition

Question

In pandas I have a dataframe of the form:

>>> import pandas as pd  
>>> df = pd.DataFrame({'ID':[51,51,51,24,24,24,31], 'x':[0,1,0,0,1,1,0]})
>>> df
   ID  x
0  51  0
1  51  1
2  51  0
3  24  0
4  24  1
5  24  1
6  31  0

For every 'ID', the value of 'x' is recorded several times; it is either 0 or 1. I want to select the rows of df whose 'ID' has 'x' equal to 1 at least twice.

I can count the number of times 'x' is 1 for every 'ID' with

>>> df.groupby('ID')['x'].sum()
ID
24    2
31    0
51    1
Name: x, dtype: int64

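But I don't know how to proceed from here. I would like the following output, which keeps only the rows for ID 24 (the only 'ID' for which 'x' is 1 at least twice):

   ID  x
3  24  0
4  24  1
5  24  1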

Scott Boston · Accepted Answer

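Use groupby with filter, which keeps every row of each group for which the function returns True:

df.groupby('ID').filter(lambda s: s.x.sum() >= 2)

Output:

   ID  x
3  24  0
4  24  1
5  24  1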

BENY · Answer

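Use transform to broadcast each group's sum back onto its rows, then use the comparison as a boolean mask:

df = pd.DataFrame({'ID':[51,51,51,24,24,24,31], 'x':[0,1,0,0,1,1,0]})
df.loc[df.groupby('ID')['x'].transform('sum') >= 2, :]

Output: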
   ID  x
3  24  0
4  24  1
5  24  1
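
To make the difference between the two answers easier to see, here is a minimal sketch that runs both on the question's data and checks that they select the same rows. The third variant (aggregating once and matching with isin) is not from either answer; it is added only for comparison, and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'ID': [51, 51, 51, 24, 24, 24, 31],
                   'x':  [0, 1, 0, 0, 1, 1, 0]})

# transform returns one value per original row: the sum of x within that row's ID.
group_sum = df.groupby('ID')['x'].transform('sum')

# Approach 1 (BENY): boolean mask built from the broadcast sums.
by_transform = df[group_sum >= 2]

# Approach 2 (Scott Boston): keep whole groups whose sum passes the test.
by_filter = df.groupby('ID').filter(lambda g: g['x'].sum() >= 2)

# Approach 3 (added for comparison): aggregate once, then keep rows
# whose ID belongs to the set of IDs that pass the test.
sums = df.groupby('ID')['x'].sum()
by_isin = df[df['ID'].isin(sums[sums >= 2].index)]

print(by_transform)
```

All three keep the original row index. The transform and isin variants use built-in aggregations and so avoid calling a Python lambda once per group, which tends to matter when there are many groups; filter is arguably the most readable for a small frame like this one.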

Tags: python, pandas, pandas-groupby

DominikS
