Pandas: rolling mean by time interval

I've got a bunch of polling data, and I want to compute a Pandas rolling mean to get an estimate for each day based on a three-day window. According to this question, the rolling_* functions compute the window from a specified number of values, not from a specific datetime range.

How do I implement this functionality?

Sample input data:

    polls_subset.tail(20)
    Out[185]:
                favorable  unfavorable  other
    enddate
    2012-10-25       0.48         0.49   0.03
    2012-10-25       0.51         0.48   0.02
    2012-10-27       0.51         0.47   0.02
    2012-10-26       0.56         0.40   0.04
    2012-10-28       0.48         0.49   0.04
    2012-10-28       0.46         0.46   0.09
    2012-10-28       0.48         0.49   0.03
    2012-10-28       0.49         0.48   0.03
    2012-10-30       0.53         0.45   0.02
    2012-11-01       0.49         0.49   0.03
    2012-11-01       0.47         0.47   0.05
    2012-11-01       0.51         0.45   0.04
    2012-11-03       0.49         0.45   0.06
    2012-11-04       0.53         0.39   0.00
    2012-11-04       0.47         0.44   0.08
    2012-11-04       0.49         0.48   0.03
    2012-11-04       0.52         0.46   0.01
    2012-11-04       0.50         0.47   0.03
    2012-11-05       0.51         0.46   0.02
    2012-11-07       0.51         0.41   0.00

The output should have only one row for each date.

asked Apr 02 '13 18:04

Anov

2 Answers

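Since this question was asked, a time-window capability has been added to pandas: rolling accepts an offset string such as '2s' when the index is datetime-like, so the window covers a span of time rather than a fixed number of rows. See this link.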

    In [1]: df = DataFrame({'B': range(5)})

    In [2]: df.index = [Timestamp('20130101 09:00:00'),
       ...:             Timestamp('20130101 09:00:02'),
       ...:             Timestamp('20130101 09:00:03'),
       ...:             Timestamp('20130101 09:00:05'),
       ...:             Timestamp('20130101 09:00:06')]

    In [3]: df
    Out[3]:
                         B
    2013-01-01 09:00:00  0
    2013-01-01 09:00:02  1
    2013-01-01 09:00:03  2
    2013-01-01 09:00:05  3
    2013-01-01 09:00:06  4

    In [4]: df.rolling(2, min_periods=1).sum()
    Out[4]:
                           B
    2013-01-01 09:00:00  0.0
    2013-01-01 09:00:02  1.0
    2013-01-01 09:00:03  3.0
    2013-01-01 09:00:05  5.0
    2013-01-01 09:00:06  7.0

    In [5]: df.rolling('2s', min_periods=1).sum()
    Out[5]:
                           B
    2013-01-01 09:00:00  0.0
    2013-01-01 09:00:02  1.0
    2013-01-01 09:00:03  3.0
    2013-01-01 09:00:05  3.0
    2013-01-01 09:00:06  7.0
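Applied to the polling data from the question, the same idea might look like the minimal sketch below. It assumes the frame is indexed by enddate as a DatetimeIndex; the names polls, rolled and per_day, and the shortened five-row sample, are illustrative and not part of the original post. Note that this variant weights every poll inside the window equally and only produces rows for dates that actually have a poll.

```python
import pandas as pd

# Illustrative subset of the question's data (first five rows, two columns).
polls = pd.DataFrame(
    {
        "favorable":   [0.48, 0.51, 0.51, 0.56, 0.48],
        "unfavorable": [0.49, 0.48, 0.47, 0.40, 0.49],
    },
    index=pd.to_datetime(
        ["2012-10-25", "2012-10-25", "2012-10-27", "2012-10-26", "2012-10-28"]
    ),
)
polls.index.name = "enddate"

# Time-based windows need a monotonic index, and the sample is out of order.
polls = polls.sort_index()

# "3D" means: for each row, average every poll whose enddate falls in the
# three days ending at that row's enddate (window is open on the left).
rolled = polls.rolling("3D").mean()

# Several polls can share a date; the last row for a date has seen all of
# that date's polls, so keep it to get one row per date.
per_day = rolled.groupby(level=0).last()
print(per_day)
```

If an estimate is also needed for dates without any poll, the resample-based approach in the other answer fills those gaps before averaging.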

answered Oct 15 '22 17:10

Martin

What about something like this:

First, resample the data frame into 1-day intervals. This takes the mean of the values for any date that appears more than once. Use the fill_method option to fill in dates that have no polls. Then pass the resampled frame into pd.rolling_mean with a window of 3 and min_periods=1:

    pd.rolling_mean(df.resample("1D", fill_method="ffill"), window=3, min_periods=1)

                favorable  unfavorable     other
    enddate
    2012-10-25   0.495000     0.485000  0.025000
    2012-10-26   0.527500     0.442500  0.032500
    2012-10-27   0.521667     0.451667  0.028333
    2012-10-28   0.515833     0.450000  0.035833
    2012-10-29   0.488333     0.476667  0.038333
    2012-10-30   0.495000     0.470000  0.038333
    2012-10-31   0.512500     0.460000  0.029167
    2012-11-01   0.516667     0.456667  0.026667
    2012-11-02   0.503333     0.463333  0.033333
    2012-11-03   0.490000     0.463333  0.046667
    2012-11-04   0.494000     0.456000  0.043333
    2012-11-05   0.500667     0.452667  0.036667
    2012-11-06   0.507333     0.456000  0.023333
    2012-11-07   0.510000     0.443333  0.013333

UPDATE: As Ben points out in the comments, with pandas 0.18.0 the syntax has changed. With the new syntax this would be:

    df.resample("1D").mean().ffill().rolling(window=3, min_periods=1).mean()
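As a self-contained check of the resample-then-roll approach, here is a minimal sketch on the first nine rows of the question's sample data. The variable names polls, daily and smoothed are illustrative. It assumes the per-day aggregation is a mean and that missing days are forward-filled, which is what the description and the output table above imply.

```python
import pandas as pd

# First nine rows of the question's sample data (two columns for brevity).
polls = pd.DataFrame(
    {
        "favorable":   [0.48, 0.51, 0.51, 0.56, 0.48, 0.46, 0.48, 0.49, 0.53],
        "unfavorable": [0.49, 0.48, 0.47, 0.40, 0.49, 0.46, 0.49, 0.48, 0.45],
    },
    index=pd.to_datetime(
        [
            "2012-10-25", "2012-10-25", "2012-10-27", "2012-10-26",
            "2012-10-28", "2012-10-28", "2012-10-28", "2012-10-28",
            "2012-10-30",
        ]
    ),
)
polls.index.name = "enddate"

# Step 1: one row per calendar day. Days with several polls are averaged;
# days with no poll (2012-10-29 here) become NaN and are forward-filled.
daily = polls.sort_index().resample("1D").mean().ffill()

# Step 2: three-row rolling mean over the now-regular daily series.
# min_periods=1 lets the first two days use a shorter window.
smoothed = daily.rolling(window=3, min_periods=1).mean()
print(smoothed)
```

Because each day is collapsed to a single value before the rolling step, this weights days equally, not individual polls, so a day with four polls counts the same as a day with one.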

answered Oct 15 '22 15:10

Zelazny7

Tags: python, pandas, time-series, rolling-computation
