I have this data that contains entries of 2 players playing a game one after another then after they both gain a win or lose they get a shared score (logic of this is unimportant numbers are random anyway this is just an example to describe what I want).
So there are scores obtained for every possible outcome after player P1 and player P2 have played.
The logic of the game is not important, all I want to know is if I can create a new dataframe of all unique combinations of these 4 players playing using my initial dataframe. So calculate a new score for all possible combinations of these 4 Players if they all play and get a score together, let's say their total scores would be summed up.
Example:
Player_1 Player_2 Player_3 Player_4 Outcome_1 Outcome_2 Outcome_3 Outcome_4 Score
P1 P2 P3 P4 win win win win 72
and other possible unique combinations.
The key is to get a score of 30 from the combination where both P1 and P2 win and get 42 from the combination where both P3 and P4 have won and sum them to create the score if these 4 players have played and they have all won.
I can do this with generating unique combinations etc., but in a real use case with larger parameters etc. its too long and results in dirty hard to read code. What I want to know is is there a way to achieve this using only operations such as merge, groupby, join, agg etc.
import pandas as pd
data = {
"Player_1": ["P1", "P1", "P1", "P1", "P2", "P2", "P2", "P2", "P1", "P1", "P1", "P1", "P3", "P3", "P3", "P3"],
"Player_2": ["P2", "P2", "P2", "P2", "P3", "P3", "P3", "P3", "P4", "P4", "P4", "P4", "P4", "P4", "P4", "P4"],
"Outcome_1": ["win", "win", "lose", "lose", "win", "win", "lose", "lose", "win", "win", "lose", "lose", "win", "win", "lose", "lose"],
"Outcome_2": ["win", "lose", "win", "lose", "win", "lose", "win", "lose", "win", "lose", "win", "lose", "win", "lose", "win", "lose"],
"Score": [30, 45, 12, 78, 56, 21, 67, 90, 15, 32, 68, 88, 42, 74, 8, 93]
}
df = pd.DataFrame(data)
print(df)
Player_1 Player_2 Outcome_1 Outcome_2 Score
0 P1 P2 win win 30
1 P1 P2 win lose 45
2 P1 P2 lose win 12
3 P1 P2 lose lose 78
4 P2 P3 win win 56
5 P2 P3 win lose 21
6 P2 P3 lose win 67
7 P2 P3 lose lose 90
8 P1 P4 win win 15
9 P1 P4 win lose 32
10 P1 P4 lose win 68
11 P1 P4 lose lose 88
12 P3 P4 win win 42
13 P3 P4 win lose 74
14 P3 P4 lose win 8
15 P3 P4 lose lose 93
I hope I've understood your question right. From the comments I'm supposing you have total 4 players in the dataframe:
from itertools import product
p1, p2, p3, p4 = np.unique(df[["Player_1", "Player_2"]].values)
df = df.set_index(["Player_1", "Player_2", "Outcome_1", "Outcome_2"])
all_data = []
for p1o1, p2o2, p3o1, p4o2 in product(["win", "lose"], repeat=4):
all_data.append(
(
p1,
p2,
p3,
p4,
p1o1,
p2o2,
p3o1,
p4o2,
df.loc[(p1, p2, p1o1, p2o2), "Score"]
+ df.loc[(p3, p4, p3o1, p4o2), "Score"],
)
)
out = pd.DataFrame(
all_data,
columns=[
"Player_1",
"Player_2",
"Player_3",
"Player_4",
"Outcome_1",
"Outcome_2",
"Outcome_3",
"Outcome_4",
"Score",
],
)
Prints:
Player_1 Player_2 Player_3 Player_4 Outcome_1 Outcome_2 Outcome_3 Outcome_4 Score
0 P1 P2 P3 P4 win win win win 72
1 P1 P2 P3 P4 win win win lose 104
2 P1 P2 P3 P4 win win lose win 38
3 P1 P2 P3 P4 win win lose lose 123
4 P1 P2 P3 P4 win lose win win 87
5 P1 P2 P3 P4 win lose win lose 119
6 P1 P2 P3 P4 win lose lose win 53
7 P1 P2 P3 P4 win lose lose lose 138
8 P1 P2 P3 P4 lose win win win 54
9 P1 P2 P3 P4 lose win win lose 86
10 P1 P2 P3 P4 lose win lose win 20
11 P1 P2 P3 P4 lose win lose lose 105
12 P1 P2 P3 P4 lose lose win win 120
13 P1 P2 P3 P4 lose lose win lose 152
14 P1 P2 P3 P4 lose lose lose win 86
15 P1 P2 P3 P4 lose lose lose lose 171
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With