I have a dataframe that looks like this:
allHoldingsFund
  BrokerBestRate  notional_current  DistanceBestRate
0           CITI      7.859426e+05          0.023194
1          WFPBS      3.609674e+06         -0.023041
2          WFPBS      1.488828e+06         -0.023041
3            JPM      3.484168e+05         -0.106632
4           CITI      6.088499e+05          0.023194
5          WFPBS      8.665558e+06         -0.023041
6          WFPBS      4.219563e+05         -0.023041
I am trying to compute a sum product and a group by in one go, without creating an extra column to hold the product.
I have tried this line of code, which does not work:
allHoldingsFund.groupby(['BrokerBestRate'])['notional_current']*['DistanceBestRate'].sum()
How can I compute the sum product and then aggregate it using group by?
Desired output:
BrokerBestRate product of (notional_current and DistanceBestRate)
CITI 654654645665466
JPM 453454534545367
WFPBS 345345345345435
Many Thanks
You can build the product column before the groupby. assign returns a new frame, so no column is added to df itself:
df.assign(col=df.notional_current*df.DistanceBestRate).groupby('BrokerBestRate',as_index=False).col.sum()
Out[372]:
BrokerBestRate col
0 CITI 32350.817245
1 JPM -37152.380218
2 WFPBS -326860.001568
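For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame from the sample printed in the question (so the values carry only the displayed precision) and applies the same assign-then-groupby step. The column name sumproduct is my own choice for illustration, not something from the question.

```python
import pandas as pd

# Sample data copied from the frame printed in the question.
df = pd.DataFrame({
    "BrokerBestRate": ["CITI", "WFPBS", "WFPBS", "JPM", "CITI", "WFPBS", "WFPBS"],
    "notional_current": [7.859426e+05, 3.609674e+06, 1.488828e+06, 3.484168e+05,
                         6.088499e+05, 8.665558e+06, 4.219563e+05],
    "DistanceBestRate": [0.023194, -0.023041, -0.023041, -0.106632,
                         0.023194, -0.023041, -0.023041],
})

# assign() works on a copy, so df itself never gains the extra column.
result = (
    df.assign(sumproduct=df["notional_current"] * df["DistanceBestRate"])
      .groupby("BrokerBestRate", as_index=False)["sumproduct"]
      .sum()
)
print(result)
```

After this runs, df still has only its three original columns; the temporary product column exists only inside the chained expression.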
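The simplest, but typically slowest, way would be to use apply. Selecting the two numeric columns first keeps the string key column out of the row-wise product:
In [43]: df.groupby("BrokerBestRate")[["notional_current", "DistanceBestRate"]].apply(lambda x: x.prod(axis=1).sum())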
Out[43]:
BrokerBestRate
CITI 32350.817245
JPM -37152.380218
WFPBS -326860.001568
dtype: float64
But you can also compute the product column first, and then call groupby on that:
In [44]: df.eval("notional_current * DistanceBestRate").groupby(df.BrokerBestRate).sum()
Out[44]:
BrokerBestRate
CITI 32350.817245
JPM -37152.380218
WFPBS -326860.001568
dtype: float64
In [45]: df[["notional_current", "DistanceBestRate"]].prod(axis=1).groupby(df["BrokerBestRate"]).sum()
Out[45]:
BrokerBestRate
CITI 32350.817245
JPM -37152.380218
WFPBS -326860.001568
dtype: float64
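The variants above return a Series indexed by broker, while the question asks for a two-column table. One way to get that shape is to name the Series and reset its index. This is only a sketch, assuming the sample frame from the question; the name sumproduct is arbitrary.

```python
import pandas as pd

# Sample data copied from the frame printed in the question.
df = pd.DataFrame({
    "BrokerBestRate": ["CITI", "WFPBS", "WFPBS", "JPM", "CITI", "WFPBS", "WFPBS"],
    "notional_current": [7.859426e+05, 3.609674e+06, 1.488828e+06, 3.484168e+05,
                         6.088499e+05, 8.665558e+06, 4.219563e+05],
    "DistanceBestRate": [0.023194, -0.023041, -0.023041, -0.106632,
                         0.023194, -0.023041, -0.023041],
})

# Row-wise product as a standalone Series; df is left unchanged.
product = df["notional_current"] * df["DistanceBestRate"]

# Group the Series by the broker column and sum within each group.
by_broker = product.groupby(df["BrokerBestRate"]).sum()

# Name the values and move the broker index back into a regular column.
table = by_broker.rename("sumproduct").reset_index()
print(table)
```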