Here is a sample DataFrame:
import pandas as pd
df = pd.DataFrame({'ID': ['a', 'a', 'a', 'b', 'b', 'c', 'c'],
                   'color': ['red', 'blue', 'green', 'red', 'blue', 'red', 'green']})
After grouping by ID, I want two columns containing every ordered pair of distinct colors within each group.
The resulting DataFrame should look like this:
| ID | color1 | color2 | 
|---|---|---|
| a | red | blue | 
| a | red | green | 
| a | blue | red | 
| a | blue | green | 
| a | green | red | 
| a | green | blue | 
| b | red | blue | 
| b | blue | red | 
| c | red | green | 
| c | green | red | 
I have tried using itertools.permutations, but I am looking for something more direct, or a solution that makes more use of pandas.
I think you can do a self-merge on ID and then use query to drop the rows where a color is paired with itself:
df.merge(df, on='ID', suffixes=('1', '2')).query('color1 != color2')
Or, equivalently, merge and then filter with a boolean mask:
(df.merge(df, on='ID', suffixes=('1', '2'))
   .loc[lambda x: x['color1'] != x['color2']]
)
Output:
   ID color1 color2
1   a    red   blue
2   a    red  green
3   a   blue    red
5   a   blue  green
6   a  green    red
7   a  green   blue
10  b    red   blue
11  b   blue    red
14  c    red  green
15  c  green    red
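To see why this works, here is a minimal, self-contained sketch of the same self-merge approach. Merging the frame with itself on ID pairs every row with every row that shares its ID (a per-group Cartesian product), and the filter then removes the rows where a color is paired with itself. The names `pairs` and `result` are just illustrative, and the `reset_index(drop=True)` call is an optional addition to tidy up the gaps in the index shown above.

```python
import pandas as pd

df = pd.DataFrame({'ID': ['a', 'a', 'a', 'b', 'b', 'c', 'c'],
                   'color': ['red', 'blue', 'green', 'red', 'blue', 'red', 'green']})

# Self-merge on ID: each row is paired with every row that has the same ID.
# The suffixes turn the two overlapping 'color' columns into color1 / color2.
pairs = df.merge(df, on='ID', suffixes=('1', '2'))

# Keep only the rows where the two colors differ, then renumber the index.
result = pairs[pairs['color1'] != pairs['color2']].reset_index(drop=True)

print(len(pairs), len(result))
print(result)
```

For the sample data, group a contributes 3 x 3 rows and groups b and c contribute 2 x 2 rows each, so the merge yields 17 rows, of which 10 remain after filtering. Note that this assumes each color appears at most once per ID; if a color were repeated within a group, the filter would also drop pairs made from the duplicate entries.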