pandas rolling window & datetime indexes: What does `offset` mean?

Tags: python, datetime, pandas, dataframe

The rolling window function pandas.DataFrame.rolling of pandas 0.22 takes a window argument that is described as follows:

window : int, or offset

Size of the moving window. This is the number of observations used for calculating the statistic. Each window will be a fixed size.

If its an offset then this will be the time period of each window. Each window will be a variable sized based on the observations included in the time-period. This is only valid for datetimelike indexes. This is new in 0.19.0

What actually is an offset in this context?


asked Feb 18 '18 18:02

ascripter

1 Answer

In a nutshell, if you use an offset like "2D" (2 days), pandas will use the datetime info in the index (if available), potentially accounting for any missing rows or irregular frequencies. But if you use a simple int like 2, then pandas will treat the index as a simple integer index [0,1,2,...] and ignore any datetime info in the index.

A simple example should make this clear:

import pandas as pd

df = pd.DataFrame({'x': range(4)},
    index=pd.to_datetime(['1-1-2018', '1-2-2018', '1-4-2018', '1-5-2018']))

            x
2018-01-01  0
2018-01-02  1
2018-01-04  2
2018-01-05  3

Note that (1) the index is a datetime, but also (2) it is missing '2018-01-03'. So if you use a plain integer like 2, rolling will just look at the current row and the one before it, regardless of the datetime values (in a sense it behaves like iloc[i-1:i+1], where i is the position of the current row):

df.rolling(2).count()

              x
2018-01-01  1.0
2018-01-02  2.0
2018-01-04  2.0
2018-01-05  2.0
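To make the positional behaviour concrete, here is a minimal sketch that compares an integer window against hand-written iloc slices. It rebuilds the same frame with ISO date strings, uses sum() instead of count() so the slices are easy to check, and passes min_periods=1 explicitly (an assumption made here because the default handling of min_periods for integer windows has differed between pandas versions).

```python
import pandas as pd

df = pd.DataFrame(
    {"x": range(4)},
    index=pd.to_datetime(["2018-01-01", "2018-01-02", "2018-01-04", "2018-01-05"]),
)

# Integer window: purely positional, the dates in the index are never consulted.
rolled = df["x"].rolling(2, min_periods=1).sum().tolist()

# The same numbers from explicit positional slices: rows i-1 and i (clipped at 0).
manual = [int(df["x"].iloc[max(0, i - 1): i + 1].sum()) for i in range(len(df))]

print(rolled)
print(manual)
```

Both lists hold the same values, which shows that the gap at 2018-01-03 has no effect on an integer window.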

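Conversely, if you use an offset of 2 days ('2D'), rolling will use the actual datetime values and account for any irregularities in the datetime index. By default each window covers the 2 days up to and including the current row's timestamp, so for 2018-01-04 the row at 2018-01-02 (exactly 2 days earlier) falls outside the window and the count is 1: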

df.rolling('2D').count()

              x
2018-01-01  1.0
2018-01-02  2.0
2018-01-04  1.0
2018-01-05  2.0
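As a rough sketch of what the offset window selects, the snippet below rebuilds the counts by hand, assuming the default window of (t - 2 days, t] for each timestamp t. It also passes the same duration as a pd.Timedelta instead of the string '2D'. The variable names are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"x": range(4)},
    index=pd.to_datetime(["2018-01-01", "2018-01-02", "2018-01-04", "2018-01-05"]),
)

window = pd.Timedelta(days=2)

# For each timestamp t, count the rows whose index lies in (t - 2 days, t].
manual_counts = [
    int(((df.index > t - window) & (df.index <= t)).sum())
    for t in df.index
]

# The same window given as an offset string and as a Timedelta.
by_string = df["x"].rolling("2D").count().tolist()
by_timedelta = df["x"].rolling(window).count().tolist()

print(manual_counts)
print(by_string)
print(by_timedelta)
```

So an "offset" here is simply a fixed span of time; the string '2D' and a two-day Timedelta describe the same window.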

Also note, you need the index to be sorted in ascending order when using a date offset, but it doesn't matter when using a simple integer (since you're just ignoring the index anyway).
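The ordering requirement can be checked with a small sketch like the one below. The shuffled frame is invented for this example; the try/except wraps both the rolling call and the aggregation, since the point at which pandas validates the index may depend on the version.

```python
import pandas as pd

# Same four dates as before, deliberately out of order.
unsorted = pd.DataFrame(
    {"x": [2, 0, 3, 1]},
    index=pd.to_datetime(["2018-01-04", "2018-01-01", "2018-01-05", "2018-01-02"]),
)

# An integer window works on any row order because it only looks at positions.
positional = unsorted["x"].rolling(2, min_periods=1).sum().tolist()

# An offset window on a shuffled datetime index is rejected with a ValueError.
try:
    unsorted["x"].rolling("2D").sum()
    offset_failed = False
except ValueError:
    offset_failed = True

# Sorting the index first makes the offset window usable.
fixed = unsorted.sort_index()["x"].rolling("2D").sum().tolist()

print(positional, offset_failed, fixed)
```

Calling sort_index() before rolling is usually the simplest fix when the data arrives out of order.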


answered Oct 01 '22 23:10

JohnE
