Pandas rolling gives NaN

Tags:

pandas

I'm looking at the tutorials on window functions, but I don't quite understand why the following code produces NaNs.

If I understand correctly, the code creates a rolling window of size 2. Why do the first, fourth, and fifth rows have NaN? At first, I thought it was because adding NaN to another number produces NaN, but then I'm not sure why the second row isn't NaN.

dft = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]},
                   index=pd.date_range('20130101 09:00:00', periods=5, freq='s'))

In [58]: dft.rolling(2).sum()
Out[58]:
                       B
2013-01-01 09:00:00  NaN
2013-01-01 09:00:01  1.0
2013-01-01 09:00:02  3.0
2013-01-01 09:00:03  NaN
2013-01-01 09:00:04  NaN


asked Nov 26 '16 01:11

Huey

1 Answer

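The first thing to notice is that, for an integer window, min_periods defaults to the window size n, so rolling needs n non-NaN observations in a window (the current row plus n-1 prior rows) before it produces a result. If that condition is not met, it returns NaN for the window. This is what happens at the first row, which has no prior row. The fourth and fifth rows are NaN for the same reason: their windows are [2, NaN] and [NaN, 4], so each holds only one valid observation. The second row is fine because its window, [0, 1], contains no NaN.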
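To see this for yourself, you can count the valid observations in each window and put them next to the default rolling sum. This is only a sketch built on the question's data; the names valid, default_sum and summary are mine, not part of pandas.

```python
import numpy as np
import pandas as pd

dft = pd.DataFrame(
    {'B': [0, 1, 2, np.nan, 4]},
    index=pd.date_range('20130101 09:00:00', periods=5, freq='s'),
)

# Number of non-NaN observations in each window of size 2.
# min_periods=1 is passed only so that partial windows are counted too.
valid = dft['B'].rolling(2, min_periods=1).count()

# Default behaviour: for an integer window, min_periods equals the window size (2).
default_sum = dft['B'].rolling(2).sum()

# Side-by-side view: the sum is NaN exactly where fewer than 2 values are valid.
summary = pd.DataFrame({'valid_obs': valid, 'rolling_sum': default_sum})
print(summary)
```

Rows one, four and five each have a single valid observation, which is below the default threshold of 2, and those are exactly the rows where the sum is NaN.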

If you would like to avoid returning NaN, you can pass min_periods=1 to the method, which reduces the minimum required number of valid observations in the window from 2 to 1:

>>> dft.rolling(2, min_periods=1).sum()
                       B
2013-01-01 09:00:00  0.0
2013-01-01 09:00:01  1.0
2013-01-01 09:00:02  3.0
2013-01-01 09:00:03  2.0
2013-01-01 09:00:04  4.0
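Since the frame has a datetime index, another option is a time-based window. As far as I know, min_periods defaults to 1 when the window is given as an offset, so on this regularly spaced data it should give the same numbers as above. A small sketch, where '2s' is my choice of window and by_offset and by_count are just illustrative names:

```python
import numpy as np
import pandas as pd

dft = pd.DataFrame(
    {'B': [0, 1, 2, np.nan, 4]},
    index=pd.date_range('20130101 09:00:00', periods=5, freq='s'),
)

# Offset window: each window covers the 2 seconds ending at the current row.
# No min_periods is passed, so the default for offset windows applies.
by_offset = dft.rolling('2s').sum()

# Integer window with min_periods set explicitly, as in the answer above.
by_count = dft.rolling(2, min_periods=1).sum()

print(by_offset)
print(by_offset.equals(by_count))
```

This only lines up with rolling(2, min_periods=1) because the rows are exactly one second apart; with irregular timestamps an offset window can cover a different number of rows each time.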

answered Sep 22 '22 22:09

Brian Huey
