I have a data frame and I am trying to tabulate the scores of both teams in each game on a single row. I tried using a lambda function, but had no success with it. The data frame I currently have is the first one below, and I would like to create a dataset in the form of the second one. Thanks.
GameId  Team       Home  Score
1       Spirit     1     81
1       Rockers    0     66
2       Lightning  1     73
2       Flames     0     82
Game ID  Home Team  Away Team  Home Score  Away Score
1        Spirit     Rockers    81          66
2        Lightning  Flames     73          82
With apply() you can call a function on every row of a pandas DataFrame. To have apply() pass the rows (rather than the columns) to the function, use axis=1.
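As a minimal sketch of row-wise apply() on the question's data: the helper name describe_row and the Summary column are made up for illustration and are not part of the question.

```python
import pandas as pd

df = pd.DataFrame({
    "GameId": [1, 1, 2, 2],
    "Team": ["Spirit", "Rockers", "Lightning", "Flames"],
    "Home": [1, 0, 1, 0],
    "Score": [81, 66, 73, 82],
})

def describe_row(row):
    # Hypothetical helper: `row` is a Series holding one row of df.
    side = "Home" if row["Home"] == 1 else "Away"
    return f"{side}: {row['Team']} ({row['Score']})"

# axis=1 passes each row to the function; the default axis=0 passes each column.
df["Summary"] = df.apply(describe_row, axis=1)
print(df["Summary"].tolist())
```

This only labels each row; it does not yet combine the two rows of a game, which is what the answers below address.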
To select rows by label in pandas, the syntax is df.loc[start:stop:step], where start is the label of the first row to take, stop is the label of the last row to take (it is included), and step is the number of positions to advance after each extraction; for example, a step of 2 selects alternate rows.
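A short sketch of label slicing with .loc, assuming the default integer index (labels 0 to 3); the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Team": ["Spirit", "Rockers", "Lightning", "Flames"],
    "Score": [81, 66, 73, 82],
})

# Unlike an ordinary Python slice, .loc includes the stop label,
# so this takes the rows labelled 0, 1 and 2.
first_three = df.loc[0:2]

# A step of 2 keeps every other row: labels 0 and 2.
alternate = df.loc[::2]

print(first_three)
print(alternate)
```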
Try this:
Input:
import pandas as pd

raw_df = pd.DataFrame({"GameId": [1, 1, 2, 2],
                       "Team": ["Spirit", "Rockers", "Lightning", "Flames"],
                       "Home": [1, 0, 1, 0],
                       "Score": [81, 66, 73, 82]})
print(raw_df)
Output:
   GameId       Team  Home  Score
0       1     Spirit     1     81
1       1    Rockers     0     66
2       2  Lightning     1     73
3       2     Flames     0     82
Input:
# Replace the 1/0 flag with labels that will become part of the column names.
# Assigning the whole column avoids writing strings into an integer column via .loc.
raw_df["Home"] = raw_df["Home"].map({1: "Home", 0: "Away"})
# One row per game, one column per (value, Home/Away) pair.
result = raw_df.pivot_table(index=["GameId"],
                            columns=["Home"],
                            values=["Team", "Score"],
                            aggfunc={"Team": lambda team: " ".join(team),
                                     "Score": "first"})  # one score per team per game
# Descending sort puts Team before Score and Home before Away.
result = result.sort_index(axis="columns", level=[0, "Home"], ascending=False)
# Flatten ("Team", "Home") into "Home Team".
result.columns = [" ".join(reversed(col)) for col in result.columns]
print(result)
Output:
        Home Team  Away Team  Home Score  Away Score
GameId
1          Spirit    Rockers          81          66
2       Lightning     Flames          73          82
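import pandas as pd
df = pd.DataFrame({'GameId': [1, 1, 2, 2],
                   'Team': ['Spirit', 'Rockers', 'Lightning', 'Flames'],
                   'Home': [1, 0, 1, 0],
                   'Score': [81, 66, 73, 82]})
# Self-join: pair every row with every row of the same game.
merged = pd.merge(df, df, on='GameId')
# Drop the pairings of a team with itself (same Home flag on both sides).
merged = merged[merged['Home_x'] != merged['Home_y']]
# Keep the first remaining pairing per game.
merged = merged.drop_duplicates(subset=['GameId'])
merged = merged[['GameId', 'Team_x', 'Team_y', 'Score_x', 'Score_y']]
merged.columns = ['GameId', 'Home Team', 'Away Team', 'Home Score', 'Away Score']
Explanation: pd.merge() performs a self-join on GameId. I then remove the rows where both sides have the same Home flag (a team paired with itself), drop duplicates on GameId, select the required columns, and rename them. Note that drop_duplicates() keeps the first remaining row per game, so the home team lands in the left-hand columns only because each home row comes before its away row in df.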
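The self-join keeps whichever pairing comes first for each game, so it relies on the home row preceding the away row. If that order cannot be guaranteed, one possible variant (a sketch, not the answerer's code) is to split the frame on the Home flag and merge the two halves. The names home, away, and games and the suffixes are chosen here for illustration, and the sample data deliberately lists the away row first for game 1.

```python
import pandas as pd

df = pd.DataFrame({
    "GameId": [1, 1, 2, 2],
    "Team": ["Rockers", "Spirit", "Lightning", "Flames"],  # away row first for game 1
    "Home": [0, 1, 1, 0],
    "Score": [66, 81, 73, 82],
})

# Split into home rows and away rows; the flag itself is no longer needed.
home = df[df["Home"] == 1].drop(columns="Home")
away = df[df["Home"] == 0].drop(columns="Home")

# Join the two halves on the game; suffixes mark which side a column came from.
games = home.merge(away, on="GameId", suffixes=("_home", "_away"))
games = games.rename(columns={
    "Team_home": "Home Team",
    "Team_away": "Away Team",
    "Score_home": "Home Score",
    "Score_away": "Away Score",
})
games = games[["GameId", "Home Team", "Away Team", "Home Score", "Away Score"]]
print(games)
```

This assumes exactly one home row and one away row per game; a game with extra rows would produce extra pairings.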
First use .pivot and then use a list comprehension to rename the columns from tuples to the desired names (the columns are tuples as a result of setting Home as a column when pivoting). [::-1] reverses the name from e.g. Team Home to Home Team when joining the tuples in the list comprehension.
df = pd.pivot(df, index='GameId', columns='Home', values=['Team', 'Score']).reset_index()
# Strip the joined name: ('GameId', '') would otherwise become ' GameId'.
df.columns = [' '.join(str(s).replace('1', 'Home').replace('0', 'Away') for s in col[::-1]).strip()
              for col in df.columns]
Output:
   GameId  Away Team  Home Team  Away Score  Home Score
0       1    Rockers     Spirit          66          81
1       2     Flames  Lightning          82          73
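The pivot output above lists the away columns first because the Home flag sorts as 0, 1. A possible variant, sketched under the assumption that each game has one home and one away row, maps the flag to a label before pivoting (so no string replacement on digits is needed) and then selects the columns in the order the question asks for. The Side column and the wide variable are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "GameId": [1, 1, 2, 2],
    "Team": ["Spirit", "Rockers", "Lightning", "Flames"],
    "Home": [1, 0, 1, 0],
    "Score": [81, 66, 73, 82],
})

# Hypothetical helper column: a readable label instead of the 1/0 flag.
labelled = df.assign(Side=df["Home"].map({1: "Home", 0: "Away"}))

wide = labelled.pivot(index="GameId", columns="Side", values=["Team", "Score"])

# Columns are tuples such as ("Team", "Home"); swap the parts and join them.
wide.columns = [f"{where} {field}" for field, where in wide.columns]

# Put the columns in the requested order and turn GameId back into a column.
wide = wide[["Home Team", "Away Team", "Home Score", "Away Score"]].reset_index()
print(wide)
```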