Groupby, transform - revert back to original table

Question

I have the following dataframe:

data={"Date":["8/4/2020","8/4/2020","8/4/2020","8/4/2020","8/4/2020"],
      "Weight":[35,40,30,50,50],
      "Subportfolio":["Portfolio1","Portfolio2","Portfolio3","Portfolio4","Portfolio5"],
      "Portfolio Code":[1,1,1,2,2]}
df=pd.DataFrame(data)

The table looks this way:

    Date      Weight      Subportfolio  Portfolio Code
0   8/4/2020    35        Portfolio1          1
1   8/4/2020    40        Portfolio2          1
2   8/4/2020    30        Portfolio3          1
3   8/4/2020    50        Portfolio4          2
4   8/4/2020    50        Portfolio5          2

If you observe the weights where Portfolio Code equals 1 sum to more than 100 so I want to rescale them and have ultimately the same dataframe. Rescaling is done by

df.groupby(by="Portfolio Code").apply(lambda x: x["Weight"]/x["Weight"].sum()*100)

The resulting output is:

Portfolio Code   
1               0    33.333333
                1    38.095238
                2    28.571429
2               3    50.000000
                4    50.000000

Weights with Portfolio Code 1 are already rescaled and sum up to 100. However I don't know how to get back to the original table. I want to have in the end:

    Date      Weight            Subportfolio    Portfolio Code
0   8/4/2020    33.333333         Portfolio1          1
1   8/4/2020    38.095238         Portfolio2          1
2   8/4/2020    28.571429         Portfolio3          1
3   8/4/2020    50                Portfolio4          2
4   8/4/2020    50                Portfolio5          2

I read about using transform on the place of apply but keep on getting errors and don't know how to tackle this.

Quang Hoang · Accepted Answer

In the occasion that you want to reassign the groupby result back to the original dataframe, it usually means you want groupby().transform:

df['Weight'] = df.groupby(by="Portfolio Code")['Weight'].transform(lambda x: x/x.sum())

Or slightly better:

df['Weight'] = df['Weight']*100/df.groupby(by="Portfolio Code")['Weight'].transform('sum')

Output:

       Date     Weight Subportfolio  Portfolio Code
0  8/4/2020  33.333333   Portfolio1               1
1  8/4/2020  38.095238   Portfolio2               1
2  8/4/2020  28.571429   Portfolio3               1
3  8/4/2020  50.000000   Portfolio4               2
4  8/4/2020  50.000000   Portfolio5               2

Groupby, transform - revert back to original table

Tags:

python

pandas

group-by

apply

Martin Yordanov Georgiev

1 Answers

Quang Hoang

Recent Activity

Donate For Us

Groupby, transform - revert back to original table

Tags:

python

pandas

group-by

apply

Martin Yordanov Georgiev

1 Answers

Quang Hoang

Related questions

Recent Activity

Donate For Us