Is it possible to use pandas.DataFrame.rolling with a step greater than 1?

Tags: python, pandas, r, numpy, zoo

In R you can compute a rolling mean with a specified window that can shift by a specified amount each time.

Maybe I just haven't found it, but it doesn't seem like you can do this in pandas or any other Python library.

Does anyone know of a way around this? I'll give you an example of what I mean:

[Image: example]

Here we have bi-weekly data, and I am computing the two-month moving average that shifts by one month, which is 2 rows.

So in R I would do something like:

    two_month__movavg = rollapply(mydata, 4, mean, by = 2, na.pad = FALSE)

Is there no equivalent in Python?

EDIT1:

                      DATE    A DEMAND   ...      AA DEMAND  A Price
    0  2006/01/01 00:30:00  8013.27833   ...     5657.67500    20.03
    1  2006/01/01 01:00:00  7726.89167   ...     5460.39500    18.66
    2  2006/01/01 01:30:00  7372.85833   ...     5766.02500    20.38
    3  2006/01/01 02:00:00  7071.83333   ...     5503.25167    18.59
    4  2006/01/01 02:30:00  6865.44000   ...     5214.01500    17.53


asked Jan 22 '19 03:01

user8261831

1 Answer

I know it has been a long time since the question was asked, but I bumped into this same problem, and when dealing with long time series you really want to avoid unnecessarily calculating values you are not interested in. Since the pandas rolling method did not implement a step argument (one was only added in pandas 1.5.0), I wrote a workaround using numpy.
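For comparison, here is a minimal sketch of the straightforward route: compute every window with `rolling` and then keep every `step`-th result, starting at the first complete window. This is the redundant work the numpy workaround avoids. The helper name `rolling_mean_then_slice` and the toy series are assumptions made for illustration only. Note that the `step` keyword in pandas 1.5.0 and later is documented as equivalent to slicing the result with `[::step]`, so it starts at row 0 and may not line up with the first complete window the way this slice does.

```python
import numpy as np
import pandas as pd


def rolling_mean_then_slice(series, window, step):
    """Hypothetical baseline: compute all rolling means, keep every `step`-th."""
    # Evaluates every window, including the ones that are discarded below.
    full = series.rolling(window).mean()
    # The first complete window ends at position window - 1.
    return full.iloc[window - 1::step]


s = pd.Series(np.arange(8.0))  # toy data: 0.0, 1.0, ..., 7.0
result = rolling_mean_then_slice(s, window=4, step=2)
print(result)
```

With this toy series the retained windows cover rows 0-3, 2-5 and 4-7, so the result holds 1.5, 3.5 and 5.5 at index 3, 5 and 7, which matches what `rollapply(..., 4, mean, by = 2)` returns in R.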

The workaround is basically a combination of the solution in this link and the indexing proposed by BENY.

import numpy as np


def apply_rolling_data(data, col, function, window, step=1, labels=None):
    """Perform a rolling window analysis on the column `col` of `data`

    Given a dataframe `data` with time series, call `function` on
    sections of length `window` of the data in column `col`. Append
    the results to `data` in new columns named by `labels`.

    Parameters
    ----------
    data : DataFrame
        Data to be analyzed, the dataframe must store time series
        columnwise, i.e., each column represents a time series and each
        row a time index
    col : str
        Name of the column from `data` to be analyzed
    function : callable
        Function to be called to calculate the rolling window
        analysis, the function must receive as input a 1-D numpy
        array. Its output must be either a number or a 1-D sequence
        of numbers of fixed length
    window : int
        length of the window to perform the analysis
    step : int
        step to take between two consecutive windows
    labels : list of str
        Names of the columns for the output, one per value returned
        by `function`. If None it defaults to 'metric_0',
        'metric_1', ...

    Returns
    -------
    data : DataFrame
        Input dataframe with added columns with the result of the
        analysis performed. The result of each window is stored at
        the last row of that window, all other rows are NaN

    """

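    x = _strided_app(data[col].to_numpy(), window, step)
    rolled = np.apply_along_axis(function, 1, x)

    # A scalar-valued `function` yields a 1-D result; make it one column
    if rolled.ndim == 1:
        rolled = rolled.reshape(-1, 1)

    if labels is None:
        labels = [f"metric_{i}" for i in range(rolled.shape[1])]
    elif isinstance(labels, str):
        labels = [labels]

    for label in labels:
        data[label] = np.nan

    # Each result is stored at the last row of its window
    mask = np.zeros(len(data), dtype=bool)
    mask[window - 1::step] = True
    data.loc[mask, labels] = rolled

    return data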


def _strided_app(a, L, S):  # Window len = L, Stride len/stepsize = S
    """Return a read-only 2-D view of `a` whose rows are the windows
    of length `L` taken every `S` elements
    """
    nrows = ((a.size-L)//S)+1
    n = a.strides[0]
    return np.lib.stride_tricks.as_strided(
        a, shape=(nrows, L), strides=(S*n, n), writeable=False)
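
A usage sketch that is not part of the original answer: NumPy 1.20 and later provide `sliding_window_view`, a bounds-checked alternative to `as_strided`. Slicing its result with `[::step]` gives the same window matrix as `_strided_app`, and the function is applied only to the retained windows. The helper name `windows_with_step`, the column names and the toy data are assumptions for illustration. As in `apply_rolling_data`, each result is stored at the last row of its window.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def windows_with_step(values, window, step):
    """Hypothetical helper: rows are windows of `values` taken every `step` elements."""
    # sliding_window_view exposes every window as a read-only view;
    # [::step] keeps only the windows we need, still without copying data.
    return sliding_window_view(values, window)[::step]


df = pd.DataFrame({"demand": np.arange(8.0)})  # toy data: 0.0, 1.0, ..., 7.0
window, step = 4, 2

win = windows_with_step(df["demand"].to_numpy(), window, step)
means = win.mean(axis=1)  # one mean per retained window

# Store each result at the last row of its window; other rows stay NaN.
df["demand_mean"] = np.nan
df.iloc[window - 1::step, df.columns.get_loc("demand_mean")] = means

print(df)
```

With these toy values the three retained windows cover rows 0-3, 2-5 and 4-7, so rows 3, 5 and 7 receive 1.5, 3.5 and 5.5 while every other row of `demand_mean` remains NaN.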


answered Sep 21 '22 18:09

pgaluzio
