Is there an optimal way to get all combinations of values in a grouped pandas dataframe?

Question

Sample dataframe:

import pandas as pd

df = pd.DataFrame({'ID': ['a', 'a', 'a', 'b', 'b', 'c', 'c'],
                   'color': ['red', 'blue', 'green', 'red', 'blue', 'red', 'green']})

I want two columns containing every ordered pair of distinct color values within each ID group.
The resulting dataframe should look like this:

ID	color1	color2
a	red	blue
a	red	green
a	blue	red
a	blue	green
a	green	red
a	green	blue
b	red	blue
b	blue	red
c	red	green
c	green	red

I have tried using itertools.permutations, but I am looking for something more direct, or for a solution that makes more use of pandas.
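For reference, the itertools.permutations attempt mentioned above might look roughly like the sketch below. The question does not show the original attempt, so this is an assumed reconstruction; the names rows and baseline are hypothetical.

```python
from itertools import permutations

import pandas as pd

df = pd.DataFrame({'ID': ['a', 'a', 'a', 'b', 'b', 'c', 'c'],
                   'color': ['red', 'blue', 'green', 'red', 'blue', 'red', 'green']})

# Assumed baseline: build every ordered pair of colors within each ID group.
# sort=False keeps the groups in their order of first appearance.
rows = [
    (id_, c1, c2)
    for id_, colors in df.groupby('ID', sort=False)['color']
    for c1, c2 in permutations(colors, 2)
]
baseline = pd.DataFrame(rows, columns=['ID', 'color1', 'color2'])
print(baseline)
```

This works, but the pairing happens in a Python-level loop rather than inside pandas, which is what the question is trying to avoid.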

Quang Hoang · Accepted Answer

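You can do a self-merge on ID, which pairs every color with every other color in the same group, and then use query to drop the rows where both colors are the same:

df.merge(df, on='ID', suffixes=('1', '2')).query('color1 != color2')

Or, equivalently, merge and then filter with a boolean mask:

(df.merge(df, on='ID', suffixes=('1', '2'))
   .loc[lambda x: x['color1'] != x['color2']]
)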

Output:

   ID color1 color2
1   a    red   blue
2   a    red  green
3   a   blue    red
5   a   blue  green
6   a  green    red
7   a  green   blue
10  b    red   blue
11  b   blue    red
14  c    red  green
15  c  green    red
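The output above keeps the row labels from the merged frame, which is why the index has gaps. A self-contained sketch that also renumbers the rows could look like this; the variable name pairs and the reset_index step are additions for illustration, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'ID': ['a', 'a', 'a', 'b', 'b', 'c', 'c'],
                   'color': ['red', 'blue', 'green', 'red', 'blue', 'red', 'green']})

pairs = (
    df.merge(df, on='ID', suffixes=('1', '2'))  # every (color1, color2) pair per ID
      .query('color1 != color2')                # drop pairs of a color with itself
      .reset_index(drop=True)                   # renumber rows from 0
)
print(pairs)
```

Two things to keep in mind. The self-merge first builds every pair within each group, including the self-pairs, so the intermediate frame grows with the square of the group size. Also, the filter compares values, not row positions: if the same color appeared twice under one ID, pairs between those two rows would be dropped as well, which differs from what itertools.permutations would return.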
