Given the following pandas data frame:
| a b
--+-----
0 | 1 A
1 | 2 A
2 | 3 A
3 | 4 A
4 | 1 B
5 | 2 B
6 | 3 B
7 | 1 C
8 | 3 C
9 | 4 C
If you group it by column b
I want to perform an action that keeps only the rows where they have column a
in common. The result would be the following data frame:
| a b
--+-----
0 | 1 A
2 | 3 A
4 | 1 B
6 | 3 B
7 | 1 C
8 | 3 C
Is there some built in method to do this?
You can try pivot_table
with dropna
here then filter using sreries.isin
:
s = df.pivot_table(index='a',columns='b',aggfunc=len).dropna().index
df[df['a'].isin(s)]
Similarly with crosstab
:
s = pd.crosstab(df['a'],df['b'])
df[df['a'].isin(s[s.all(axis=1)].index)]
a b
0 1 A
2 3 A
4 1 B
6 3 B
7 1 C
8 3 C
Try:
df2=df.groupby("a")["b"].nunique().eq(len(df.groupby("b").groups))
df=df.merge(df2.loc[df2].rename("filter"), on="a").drop("filter", axis=1)
Output:
a b
0 1 A
1 1 B
2 1 C
3 3 A
4 3 B
5 3 C
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With