Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Groupby, transform - revert back to original table

I have the following dataframe:

data={"Date":["8/4/2020","8/4/2020","8/4/2020","8/4/2020","8/4/2020"],
      "Weight":[35,40,30,50,50],
      "Subportfolio":["Portfolio1","Portfolio2","Portfolio3","Portfolio4","Portfolio5"],
      "Portfolio Code":[1,1,1,2,2]}
df=pd.DataFrame(data)

The table looks this way:

    Date      Weight      Subportfolio  Portfolio Code
0   8/4/2020    35        Portfolio1          1
1   8/4/2020    40        Portfolio2          1
2   8/4/2020    30        Portfolio3          1
3   8/4/2020    50        Portfolio4          2
4   8/4/2020    50        Portfolio5          2

If you observe the weights where Portfolio Code equals 1 sum to more than 100 so I want to rescale them and have ultimately the same dataframe. Rescaling is done by

df.groupby(by="Portfolio Code").apply(lambda x: x["Weight"]/x["Weight"].sum()*100)

The resulting output is:

Portfolio Code   
1               0    33.333333
                1    38.095238
                2    28.571429
2               3    50.000000
                4    50.000000

Weights with Portfolio Code 1 are already rescaled and sum up to 100. However I don't know how to get back to the original table. I want to have in the end:

    Date      Weight            Subportfolio    Portfolio Code
0   8/4/2020    33.333333         Portfolio1          1
1   8/4/2020    38.095238         Portfolio2          1
2   8/4/2020    28.571429         Portfolio3          1
3   8/4/2020    50                Portfolio4          2
4   8/4/2020    50                Portfolio5          2

I read about using transform on the place of apply but keep on getting errors and don't know how to tackle this.

like image 305
Martin Yordanov Georgiev Avatar asked Dec 10 '25 02:12

Martin Yordanov Georgiev


1 Answers

In the occasion that you want to reassign the groupby result back to the original dataframe, it usually means you want groupby().transform:

df['Weight'] = df.groupby(by="Portfolio Code")['Weight'].transform(lambda x: x/x.sum())

Or slightly better:

df['Weight'] = df['Weight']*100/df.groupby(by="Portfolio Code")['Weight'].transform('sum')

Output:

       Date     Weight Subportfolio  Portfolio Code
0  8/4/2020  33.333333   Portfolio1               1
1  8/4/2020  38.095238   Portfolio2               1
2  8/4/2020  28.571429   Portfolio3               1
3  8/4/2020  50.000000   Portfolio4               2
4  8/4/2020  50.000000   Portfolio5               2
like image 176
Quang Hoang Avatar answered Dec 11 '25 14:12

Quang Hoang



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!