 

Calculating returns from a dataframe with financial data


I have a dataframe with monthly financial data:

In [89]: vfiax_monthly.head()
Out[89]:
            year  month  day       d   open  close   high    low  volume  aclose
2003-01-31  2003      1   31  731246  64.95  64.95  64.95  64.95       0   64.95
2003-02-28  2003      2   28  731274  63.98  63.98  63.98  63.98       0   63.98
2003-03-31  2003      3   31  731305  64.59  64.59  64.59  64.59       0   64.59
2003-04-30  2003      4   30  731335  69.93  69.93  69.93  69.93       0   69.93
2003-05-30  2003      5   30  731365  73.61  73.61  73.61  73.61       0   73.61

I'm trying to calculate the returns like this:

In [90]: returns = (vfiax_monthly.open[1:] - vfiax_monthly.open[:-1])/vfiax_monthly.open[1:] 

But I'm getting only zeroes:

In [91]: returns.head()
Out[91]:
2003-01-31   NaN
2003-02-28     0
2003-03-31     0
2003-04-30     0
2003-05-30     0
Freq: BM, Name: open

I think that's because the arithmetic operations get aligned on the index and that makes the [1:] and [:-1] useless.
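A small sketch of that alignment behaviour, assuming a recent pandas and using the first four open values from the table above (the names `opens`, `later`, and `earlier` are illustrative, not from the original session):

```python
import pandas as pd

# First few "open" values from the question, on their month-end dates.
idx = pd.to_datetime(["2003-01-31", "2003-02-28", "2003-03-31", "2003-04-30"])
opens = pd.Series([64.95, 63.98, 64.59, 69.93], index=idx, name="open")

# Positional slices keep their original date labels...
later = opens.iloc[1:]     # labels: Feb, Mar, Apr
earlier = opens.iloc[:-1]  # labels: Jan, Feb, Mar

# ...and subtraction matches rows by label, not by position.
# Dates present in both slices give value - same value = 0;
# dates present in only one slice give NaN.
diff = later - earlier
print(diff)
```

Each date that appears in both slices is subtracted from itself, which is where the zeroes come from; the NaN rows are the dates that exist in only one of the two slices.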

My workaround is:

In [103]: returns = (vfiax_monthly.open[1:].values - vfiax_monthly.open[:-1].values)/vfiax_monthly.open[1:].values

In [104]: returns = pd.Series(returns, index=vfiax_monthly.index[1:])

In [105]: returns.head()
Out[105]:
2003-02-28   -0.015161
2003-03-31    0.009444
2003-04-30    0.076362
2003-05-30    0.049993
2003-06-30    0.012477
Freq: BM

Is there a better way to calculate the returns? I don't like the conversion to array and then back to Series.

Daniel asked Nov 14 '12 19:11





2 Answers

Instead of slicing, use .shift, which moves the values of a DataFrame/Series relative to the index while leaving the index itself unchanged. For example:

returns = (vfiax_monthly.open - vfiax_monthly.open.shift(1))/vfiax_monthly.open.shift(1) 

This is what pct_change is doing under the bonnet. You can use shift for other calculations too, e.g.:

(3*vfiax_monthly.open + 2*vfiax_monthly.open.shift(1))/5 
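As a self-contained sketch of the two expressions above, here they are applied to the five open values from the question (assuming a recent pandas; `opens`, `prev`, and `weighted` are illustrative names):

```python
import numpy as np
import pandas as pd

idx = pd.to_datetime(
    ["2003-01-31", "2003-02-28", "2003-03-31", "2003-04-30", "2003-05-30"]
)
opens = pd.Series([64.95, 63.98, 64.59, 69.93, 73.61], index=idx, name="open")

# shift(1) moves every value down one row, so each date now holds the
# previous month's value; the first row has no predecessor and becomes NaN.
prev = opens.shift(1)

# Return relative to the previous month's value.
returns = (opens - prev) / prev

# Weighted average of the current and previous value, as in the answer.
weighted = (3 * opens + 2 * prev) / 5

# The shift-based formula agrees with pct_change up to floating-point error.
same = np.allclose(returns, opens.pct_change(), equal_nan=True)
print(returns)
print(same)
```

Because `prev` carries the same index as `opens`, the arithmetic aligns row by row without any conversion to arrays.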

You might also want to look into the rolling and window functions for other types of financial data analysis.
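A rough sketch of what a rolling calculation could look like, assuming a recent pandas where windows are created with the `.rolling()` method (older releases exposed these as separate functions). The three-month window is an arbitrary choice for illustration:

```python
import pandas as pd

idx = pd.to_datetime(
    ["2003-01-31", "2003-02-28", "2003-03-31", "2003-04-30", "2003-05-30"]
)
opens = pd.Series([64.95, 63.98, 64.59, 69.93, 73.61], index=idx, name="open")
returns = opens.pct_change()

# Three-month moving average of the price; the first two rows are NaN
# because the window is not yet full.
rolling_mean = opens.rolling(window=3).mean()

# Rolling standard deviation of the monthly returns over the same window.
# The first return is NaN, so the first full window ends on the fourth row.
rolling_vol = returns.rolling(window=3).std()

print(rolling_mean)
print(rolling_vol)
```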

Matti John answered Oct 13 '22 22:10



The easiest way to do this is to use the DataFrame.pct_change() method.

Here is a quick example:

In[1]: aapl = get_data_yahoo('aapl', start='11/1/2012', end='11/13/2012')

In[2]: aapl
Out[2]:
              Open    High     Low   Close    Volume  Adj Close
Date
2012-11-01  598.22  603.00  594.17  596.54  12903500     593.83
2012-11-02  595.89  596.95  574.75  576.80  21406200     574.18
2012-11-05  583.52  587.77  577.60  584.62  18897700     581.96
2012-11-06  590.23  590.74  580.09  582.85  13389900     580.20
2012-11-07  573.84  574.54  555.75  558.00  28344600     558.00
2012-11-08  560.63  562.23  535.29  537.75  37719500     537.75
2012-11-09  540.42  554.88  533.72  547.06  33211200     547.06
2012-11-12  554.15  554.50  538.65  542.83  18421500     542.83
2012-11-13  538.91  550.48  536.36  542.90  19033900     542.90

In[3]: aapl.pct_change()
Out[3]:
                Open      High       Low     Close    Volume  Adj Close
Date
2012-11-01       NaN       NaN       NaN       NaN       NaN        NaN
2012-11-02 -0.003895 -0.010033 -0.032684 -0.033091  0.658945  -0.033090
2012-11-05 -0.020759 -0.015378  0.004959  0.013558 -0.117186   0.013550
2012-11-06  0.011499  0.005053  0.004311 -0.003028 -0.291453  -0.003024
2012-11-07 -0.027769 -0.027423 -0.041959 -0.042635  1.116864  -0.038263
2012-11-08 -0.023020 -0.021426 -0.036815 -0.036290  0.330747  -0.036290
2012-11-09 -0.036049 -0.013073 -0.002933  0.017313 -0.119522   0.017313
2012-11-12  0.025406 -0.000685  0.009237 -0.007732 -0.445323  -0.007732
2012-11-13 -0.027502 -0.007250 -0.004251  0.000129  0.033244   0.000129
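One detail worth checking when switching to pct_change: it divides by the previous value, whereas the workaround in the question divides by the current value, so the two give slightly different numbers. A minimal offline sketch using the open values from the question (assuming a recent pandas; the column labels in the printed frame are illustrative):

```python
import pandas as pd

idx = pd.to_datetime(
    ["2003-01-31", "2003-02-28", "2003-03-31", "2003-04-30", "2003-05-30"]
)
opens = pd.Series([64.95, 63.98, 64.59, 69.93, 73.61], index=idx, name="open")

# pct_change divides by the previous value: (p_t - p_prev) / p_prev
simple = opens.pct_change().dropna()

# The workaround in the question divides by the current value:
# (p_t - p_prev) / p_t
workaround = ((opens - opens.shift(1)) / opens).dropna()

print(pd.DataFrame({"pct_change": simple, "question_workaround": workaround}))
```

For February 2003 the workaround reproduces the -0.015161 shown in the question, while pct_change gives roughly -0.0149. Which convention is wanted depends on the analysis; the conventional simple return is the one pct_change computes.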
spencerlyon2 answered Oct 13 '22 21:10
