Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas and Rolling_Mean with Offset (Average Daily Volume Calculation)

When I pull stock data into a dataframe from Yahoo, I want to be able to calculate the 5 day average of volume, excluding the current date.

Is there a way to use rolling mean with an offset? For example, a 5 day mean that excludes current day and is based on the prior 5 days.

When I run the following code

r = DataReader("BBRY", "yahoo", '2015-01-01','2015-01-31')

r['ADV']=pd.rolling_mean(r['Volume'], window=5)

It returns the 5 day volume, inclusive of the current date, so when you look at the below, 1/8 has average volume from 1/2,1/5,1/6,1/7, and 1/8. I would want 1/9 to be the first date that returns average volume and it to contain data from 1/2,1/5,1/6,1/7, and 1/8.

    Date   Open    High     Low   Close    Volume  Adj Close  Symbol           ADV 

1/2/2015  11.01   11.11   10.79   10.82   9733200      10.82    BBRY           NaN
1/5/2015  10.60   10.77   10.37   10.76  12318100      10.76    BBRY           NaN
1/6/2015  10.80   10.85   10.44   10.62  10176400      10.62    BBRY           NaN
1/7/2015  10.65   10.80   10.48   10.67  10277400      10.67    BBRY           NaN
1/8/2015  10.75   10.78   10.57   10.63   6868300      10.63    BBRY  9,874,680.00 
1/9/2015  10.59   10.65   10.28   10.38   7745600      10.38    BBRY  9,477,160.00 
like image 618
Stumbling Through Data Science Avatar asked Feb 08 '16 14:02

Stumbling Through Data Science


1 Answers

In the latest version of pandas (> 0.18.0), the syntax would change to:

df['Volume'].rolling(window=5).mean().shift(1)

like image 157
tsando Avatar answered Sep 30 '22 20:09

tsando