In R you can compute a rolling mean with a specified window that can shift by a specified amount each time.
However, maybe I just haven't found it, but it doesn't seem like you can do this in pandas or any other Python library?
Does anyone know of a way around this? I'll give you an example of what I mean:
Here we have bi-weekly data, and I am computing the two-month moving average that shifts by one month, which is 2 rows.
So in R I would do something like: two_month_movavg = rollapply(mydata, 4, mean, by = 2, na.pad = FALSE)
Is there no equivalent in Python?
EDIT1:
DATE A DEMAND ... AA DEMAND A Price
0 2006/01/01 00:30:00 8013.27833 ... 5657.67500 20.03
1 2006/01/01 01:00:00 7726.89167 ... 5460.39500 18.66
2 2006/01/01 01:30:00 7372.85833 ... 5766.02500 20.38
3 2006/01/01 02:00:00 7071.83333 ... 5503.25167 18.59
4 2006/01/01 02:30:00 6865.44000 ... 5214.01500 17.53
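For what it's worth, newer pandas releases (1.5 and later) added a step argument to rolling, which covers rollapply's by= behaviour directly. A minimal sketch, assuming the data above is loaded into df (the file name here is hypothetical):

import pandas as pd

# Hypothetical: load the demand data shown above
df = pd.read_csv("demand.csv", parse_dates=["DATE"])

# Two-month window (4 rows), advanced one month (2 rows) at a time;
# requires pandas >= 1.5 for the `step` argument
two_month_movavg = df["A DEMAND"].rolling(window=4, step=2).mean()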
Window Rolling Mean (Moving Average): The moving average calculation creates an updated average value for each row based on the window we specify. The calculation is also called a "rolling mean" because it computes an average of the values within a specified range for each row as you go along the DataFrame.
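In pandas the plain, step-of-one version is a one-liner; continuing with the df sketched above:

# 4-row rolling mean evaluated at every row; the first 3 rows are NaN
df["A DEMAND"].rolling(window=4).mean()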
So, I know it has been a long time since the question was asked, but I bumped into this same problem, and when dealing with long time series you really want to avoid the unnecessary calculation of values you are not interested in. Since pandas' rolling method did not implement a step argument at the time, I wrote a workaround using numpy.
It is basically a combination of the solution in this link and the indexing proposed by BENY.
import numpy as np


def apply_rolling_data(data, col, function, window, step=1, labels=None):
    """Perform a rolling window analysis at the column `col` from `data`

    Given a dataframe `data` with time series, call `function` at
    sections of length `window` of the data in column `col`. Append
    the results to `data` as new columns named after `labels`.

    Parameters
    ----------
    data : DataFrame
        Data to be analyzed; the dataframe must store time series
        column-wise, i.e., each column represents a time series and
        each row a time index
    col : str
        Name of the column from `data` to be analyzed
    function : callable
        Function to be called to calculate the rolling window
        analysis; it must accept an array or pandas Series as input,
        and its output must be either a number or a pandas Series
    window : int
        Length of the window to perform the analysis
    step : int
        Step to take between two consecutive windows
    labels : str or list of str, optional
        Name(s) of the output column(s); if None, they default to
        'metric_0', 'metric_1', ..., one per output of `function`

    Returns
    -------
    data : DataFrame
        Input dataframe with added columns holding the results of
        the analysis performed
    """
    x = _strided_app(data[col].to_numpy(), window, step)
    rolled = np.apply_along_axis(function, 1, x)
    # A scalar-valued `function` gives a 1-D result; reshape it to a
    # single column so the indexing below handles both output kinds
    if rolled.ndim == 1:
        rolled = rolled.reshape(-1, 1)
    if labels is None:
        labels = [f"metric_{i}" for i in range(rolled.shape[1])]
    elif isinstance(labels, str):
        labels = [labels]
    for label in labels:
        data[label] = np.nan
    # Mark the row that closes each window: the first window ends at
    # row `window - 1`, and later windows end every `step` rows after
    mask = [False] * (window - 1) + list(
        np.arange(len(data) - (window - 1)) % step == 0)
    data.loc[data.index[mask], labels] = rolled
    return data
def _strided_app(a, L, S):  # Window len = L, Stride len/stepsize = S
    """Return a 2-D strided view of `a`: windows of length `L`,
    taken every `S` elements, without copying the data."""
    # Number of complete windows that fit into `a`
    nrows = ((a.size - L) // S) + 1
    # Byte stride between consecutive elements of `a`
    n = a.strides[0]
    return np.lib.stride_tricks.as_strided(
        a, shape=(nrows, L), strides=(S * n, n))
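Applied to the question's example, the call would look like this (the column name comes from the sample above; the labels value is just an illustrative name):

# Two-month (4-row) moving average, advancing 2 rows per step.
# np.mean returns a scalar, so the result lands in one new column.
df = apply_rolling_data(df, "A DEMAND", np.mean,
                        window=4, step=2, labels="two_month_movavg")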