I have this dataframe
Poloniex_DOGE_BTC Poloniex_XMR_BTC Daily_rets perc_ret
172 0.006085 -0.000839 0.003309 0
173 0.006229 0.002111 0.005135 0
174 0.000000 -0.001651 0.004203 0
175 0.000000 0.007743 0.005313 0
176 0.000000 -0.001013 -0.003466 0
177 0.000000 -0.000550 0.000772 0
178 0.000000 -0.009864 0.001764 0
I'm trying to make a running total of daily_rets in perc_ret
however my code just copies the values from daily_rets
df['perc_ret'] = ( df['Daily_rets'] + df['perc_ret'].shift(1) )
Poloniex_DOGE_BTC Poloniex_XMR_BTC Daily_rets perc_ret
172 0.006085 -0.000839 0.003309 NaN
173 0.006229 0.002111 0.005135 0.005135
174 0.000000 -0.001651 0.004203 0.004203
175 0.000000 0.007743 0.005313 0.005313
176 0.000000 -0.001013 -0.003466 -0.003466
177 0.000000 -0.000550 0.000772 0.000772
178 0.000000 -0.009864 0.001764 0.001764
Cumulative return for the two sub-periods is calculated by multiplying each sub-period's return. Cumulative return = (1+20%)*(1+50%) - 1 = 80%
Cumulative Percentage is calculated by the mathematical formula of dividing the cumulative sum of the column by the mathematical sum of all the values and then multiplying the result by 100. This is also applicable in Pandas Data frames.
If they are daily simple returns and you want a cumulative return, surely you must want a daily compounded number?
df['perc_ret'] = (1 + df.Daily_rets).cumprod() - 1
# Or:
# df.Daily_rets.add(1).cumprod().sub(1)
>>> df
Poloniex_DOGE_BTC Poloniex_XMR_BTC Daily_rets perc_ret
172 0.006085 -0.000839 0.003309 0.003309
173 0.006229 0.002111 0.005135 0.008461
174 0.000000 -0.001651 0.004203 0.012700
175 0.000000 0.007743 0.005313 0.018080
176 0.000000 -0.001013 -0.003466 0.014551
177 0.000000 -0.000550 0.000772 0.015335
178 0.000000 -0.009864 0.001764 0.017126
If they are log returns, then you could just use cumsum
.
you just cannot simply add them all by using cumsum
for example, if you have array [1.1, 1.1], you supposed to have 2.21, not 2.2
import numpy as np
# daily return:
df['daily_return'] = df['close'].pct_change()
# calculate cumluative return
df['cumluative_return'] = np.exp(np.log1p(df['daily_return']).cumsum())
If performance is important, use numpy.cumprod
:
np.cumprod(1 + df['Daily_rets'].values) - 1
Timings:
#7k rows
df = pd.concat([df] * 1000, ignore_index=True)
In [191]: %timeit np.cumprod(1 + df['Daily_rets'].values) - 1
41 µs ± 282 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [192]: %timeit (1 + df.Daily_rets).cumprod() - 1
554 µs ± 3.63 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With