Keep common rows within every group of a pandas dataframe

Question

Given the following pandas data frame:

  | a  b
--+-----
0 | 1  A 
1 | 2  A 
2 | 3  A 
3 | 4  A 
4 | 1  B 
5 | 2  B 
6 | 3  B 
7 | 1  C 
8 | 3  C 
9 | 4  C

If you group it by column b I want to perform an action that keeps only the rows where they have column a in common. The result would be the following data frame:

  | a  b
--+-----
0 | 1  A 
2 | 3  A 
4 | 1  B 
6 | 3  B 
7 | 1  C 
8 | 3  C

Is there some built in method to do this?

anky · Accepted Answer

You can try pivot_table with dropna here then filter using sreries.isin :

s = df.pivot_table(index='a',columns='b',aggfunc=len).dropna().index
df[df['a'].isin(s)]

Similarly with crosstab:

s = pd.crosstab(df['a'],df['b'])
df[df['a'].isin(s[s.all(axis=1)].index)]

Grzegorz Skibinski · Answer

Try:

df2=df.groupby("a")["b"].nunique().eq(len(df.groupby("b").groups))
df=df.merge(df2.loc[df2].rename("filter"), on="a").drop("filter", axis=1)

Output:

Keep common rows within every group of a pandas dataframe

Tags:

python

pandas

rindis

2 Answers

anky

Grzegorz Skibinski

Recent Activity

Donate For Us

Keep common rows within every group of a pandas dataframe

Tags:

python

pandas

rindis

2 Answers

anky

Grzegorz Skibinski

Related questions

Recent Activity

Donate For Us