Pandas: groupby before merge

Tags:

python

pandas

I have a DataFrame like the one below, but the real one has about ten thousand rows.

import pandas as pd
import numpy as np
data = {'gameId': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2], 'eventId': [1, 2, 3, 4, 5, 1, 2, 3, 4, 5], 'player': ['A', 'B', 'C', 'D', 'E', 'A', 'B', 'C', 'D', 'E'], 'related_eventId': [2, 1, 4, 3, np.nan, 2, 1, 4, 3, np.nan]}
df = pd.DataFrame(data)

I need to create a column "related_player" containing the player from the row whose eventId equals related_eventId.

If I did not have the gameId column, I could do it with a merge:

result = df.merge(df[['eventId', 'player']], left_on='related_eventId', right_on='eventId', how='left', suffixes=('', '_related'))
result.rename(columns={'player_related': 'related_player', 'eventId_related': 'related_eventId'}, inplace=True)
result = result[['eventId', 'player', 'related_eventId', 'related_player']]

But the output is not correct, because the lookup has to be done within each gameId. In R this is pretty simple, but I don't understand how to do it correctly in Python.

My expected output should look like this:
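To make the problem concrete, here is a minimal sketch using the sample data above. Because eventId repeats in every game, each row that has a related event matches one row per game, so the merge on eventId alone returns more rows than the input. The name `naive` is only used for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'gameId': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    'eventId': [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    'player': ['A', 'B', 'C', 'D', 'E', 'A', 'B', 'C', 'D', 'E'],
    'related_eventId': [2, 1, 4, 3, np.nan, 2, 1, 4, 3, np.nan],
})

# Merge on eventId only: the key is not unique across games,
# so every non-NaN related_eventId finds a match in both games.
naive = df.merge(df[['eventId', 'player']],
                 left_on='related_eventId',
                 right_on='eventId',
                 how='left',
                 suffixes=('', '_related'))

print(len(df), len(naive))  # 10 18
```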

gameId  eventId  player  related_eventId  related_player
1       1        A       2                B
1       2        B       1                A
1       3        C       4                D
1       4        D       3                C
1       5        E       NaN              NaN
2       1        A       2                B
2       2        B       1                A
2       3        C       4                D
2       4        D       3                C
2       5        E       NaN              NaN
Delopera asked Sep 02 '25 16:09

1 Answer

Add gameId to the column list ['eventId', 'player', 'gameId'] and to the left_on and right_on parameters, so that rows are matched only within the same game. Rename only player_related: renaming eventId_related to related_eventId would create a second column with that name. Include gameId in the final column selection:

# Join each row to the row in the same game whose eventId equals related_eventId
result = df.merge(df[['eventId', 'player','gameId']], 
                  left_on=['gameId','related_eventId'], 
                  right_on=['gameId','eventId'], 
                  how='left', 
                  suffixes=('', '_related'))
result = result.rename(columns={'player_related': 'related_player'})
# Selecting the columns also drops the helper column eventId_related
result = result[['gameId', 'eventId', 'player', 'related_eventId', 'related_player']]
print (result)
   gameId  eventId player  related_eventId related_player
0       1        1      A              2.0              B
1       1        2      B              1.0              A
2       1        3      C              4.0              D
3       1        4      D              3.0              C
4       1        5      E              NaN            NaN
5       2        1      A              2.0              B
6       2        2      B              1.0              A
7       2        3      C              4.0              D
8       2        4      D              3.0              C
9       2        5      E              NaN            NaN
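A possible variation on the same composite-key merge, as a sketch rather than the only way to do it: rename the lookup columns before merging so no suffixes or post-merge renames are needed, and pass `validate='many_to_one'` so pandas raises an error if a (gameId, eventId) pair is ever duplicated. The name `lookup` is hypothetical, and the sample data from the question is repeated so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'gameId': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    'eventId': [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    'player': ['A', 'B', 'C', 'D', 'E', 'A', 'B', 'C', 'D', 'E'],
    'related_eventId': [2, 1, 4, 3, np.nan, 2, 1, 4, 3, np.nan],
})

# Lookup table with one row per (gameId, eventId), renamed so that its
# columns already carry the names wanted in the result.
lookup = df[['gameId', 'eventId', 'player']].rename(
    columns={'eventId': 'related_eventId', 'player': 'related_player'})

# Left join on both keys; rows with NaN in related_eventId find no match
# and get NaN in related_player.
result = df.merge(lookup,
                  on=['gameId', 'related_eventId'],
                  how='left',
                  validate='many_to_one')
print(result)
```

The result keeps the ten input rows in their original order, with columns gameId, eventId, player, related_eventId and related_player.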
jezrael answered Sep 05 '25 10:09