I have a dataframe such as the following:
  Index      Return
2008-11-21   0.153419
2008-11-24   0.037421
2008-11-25   0.077500
What's the best way to calculate a cumulative return across all columns on the last row?
Following is the intended result:
  Index      Return
2008-11-21   0.153419
2008-11-24   0.037421
2008-11-25   0.077500
Cumulative   0.289316
Where cumulative return calculated as follows:
cumulative = (1 + return1) * (1 + return2) * (1 + return3) - 1 
What is the best way to perform this in pandas?
there is a pandas cumprod() method for that. this will work for every columns.
df.ix["Cumulative"] = ((df+1).cumprod()-1).iloc[-1]
this would be about 2 time faster than other solutions on large dataset:
In[106]: %timeit df.ix["Cumulative"] = ((df+1).cumprod()-1).iloc[-1]
10 loops, best of 3: 18.4 ms per loop
In[107]: %timeit df.ix['Cummulative'] = df.apply(lambda x: (x+1).prod()-1)
10 loops, best of 3: 32.9 ms per loop
In[110]: %timeit df.append(df.iloc[:,1:].apply(lambda col: (col + 1).prod() - 1), ignore_index=True)
10 loops, best of 3: 37.1 ms per loop
In[113]: %timeit df.append(df.apply(lambda col: prod([(1+c) for c in col]) - 1), ignore_index=True)
1 loop, best of 3: 262 ms per loop
I would suggest to never use apply if you can find a built-in method since apply is looping over the dataframe which makes it slow. Bult-in method are highly efficient and normally there is no way you are going to get faster than them using apply.
Another solution:
df.ix["Cumulative"] = (df['Return']+1).prod() - 1
This will add 1 to the df['Return'] column, multiply all the rows together, and then subtract one from the result.  This will result in a simple float value.  The result will then be placed at the index "Cumulative".  Since that index doesn't exist yet, it will be appended to the end of the DataFrame:
               Return
2008-11-21   0.153419
2008-11-25   0.077500
2008-11-24   0.037421
Cummulative  0.289316
If you want to apply this across multiple columns:
df.ix['Cummulative'] = df.apply(lambda x: (x+1).prod()-1)
This would output the following (I made a second column called "Return2" that is a copy of "Return"):
               Return   Return2
2008-11-21   0.153419  0.153419
2008-11-25   0.077500  0.077500
2008-11-24   0.037421  0.037421
Cummulative  0.289316  0.289316
                        With pandas, you can use the prod() method:
df.append(df.iloc[:,1:].apply(lambda col: (col + 1).prod() - 1), ignore_index=True)
#        Index    Return
#0  2008-11-21  0.153419
#1  2008-11-24  0.037421
#2  2008-11-25  0.077500
#3         NaN  0.289316
Or as @Randy C commented, this can be further simplified to:
df.append((df.iloc[:,1:] + 1).prod() - 1, ignore_index=True)
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With