Change rolling window size as it rolls

I have a pandas DataFrame like this:

>df

leg    speed
  1       10
  1       11
  1       12
  1       13
  1       12
  1       15
  1       19
  1       12
  2       10
  2       10
  2       12
  2       15
  2       19
  2       11
  :        :

I want to add a new column, roll_speed, that holds the rolling average of speed over the last 5 rows, with two extra conditions.

First, the calculation should be grouped by leg, so that it never takes into account the speed of rows in a different leg.

Second, the rolling window should grow from 1 up to a maximum of 5 according to the rows available. For example, in leg == 1 the first row has only one row to calculate from, so the rolling speed should be 10/1 = 10. For the second row only two rows are available, so the rolling speed should be (10+11)/2 = 10.5.

leg    speed   roll_speed
  1       10           10    # 10/1
  1       11           10.5  # (10+11)/2
  1       12           11    # (10+11+12)/3
  1       13           11.5  # (10+11+12+13)/4
  1       12           11.6  # (10+11+12+13+12)/5
  1       15           12.6  # (11+12+13+12+15)/5
  1       19           14.2  # (12+13+12+15+19)/5
  1       12           14.2  # (13+12+15+19+12)/5
  2       10           10    # 10/1
  2       10           10    # (10+10)/2
  2       12           10.7  # (10+10+12)/3
  2       15           11.8  # (10+10+12+15)/4
  2       19           13.2  # (10+10+12+15+19)/5
  2       11           13.4  # (10+12+15+19+11)/5
  :        :
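For reference, the expected column above can be reproduced without any rolling machinery. This is only a sketch to pin down the arithmetic: the helper name trailing_means is made up for illustration, and the frame is rebuilt from the sample rows shown above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "leg":   [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    "speed": [10, 11, 12, 13, 12, 15, 19, 12, 10, 10, 12, 15, 19, 11],
})

def trailing_means(values, window=5):
    """Mean of the (up to) `window` most recent values at each position.

    Hypothetical helper, not part of pandas.
    """
    out = []
    for i in range(len(values)):
        # Slice shrinks automatically near the start of the list.
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Compute leg by leg so that no window crosses a leg boundary.
expected = []
for _, group in df.groupby("leg", sort=False):
    expected.extend(trailing_means(group["speed"].tolist()))

print([round(x, 1) for x in expected])
```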

My attempt:

df['roll_speed'] = df.speed.rolling(5).mean()

But it just returns NaN for rows where fewer than five rows are available for the calculation. How can I solve this problem? Thank you for any help!
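A small reproduction of the attempt, assuming the frame holds exactly the sample rows shown above. Besides the NaN rows, it also suggests a second issue: without a groupby, the window at the first row of leg 2 reaches back into leg 1.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "leg":   [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    "speed": [10, 11, 12, 13, 12, 15, 19, 12, 10, 10, 12, 15, 19, 11],
})

# By default a fixed window of 5 needs 5 observations before it yields a value.
naive = df["speed"].rolling(5).mean()

# The first four rows have fewer than five observations behind them.
print(naive.isna().sum())  # 4

# Row 8 is the first row of leg 2, but its window still covers rows 4..8,
# so the result is (12+15+19+12+10)/5 rather than the wanted 10.
print(naive.iloc[8])
```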

asked Aug 13 '18 15:08

Makoto Miyazaki

2 Answers

Set the min_periods parameter to 1, so that a mean is produced as soon as one row is available:

df['roll_speed'] = (df.groupby('leg').speed.rolling(5, min_periods=1).mean()
                    .round(1).reset_index(drop=True))

    leg  speed  roll_speed
0     1     10        10.0
1     1     11        10.5
2     1     12        11.0
3     1     13        11.5
4     1     12        11.6
5     1     15        12.6
6     1     19        14.2
7     1     12        14.2
8     2     10        10.0
9     2     10        10.0
10    2     12        10.7
11    2     15        11.8
12    2     19        13.2
13    2     11        13.4
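One caveat, offered as a sketch rather than a correction: reset_index(drop=True) discards the original row labels, so the assignment lines up only when the frame is already ordered by leg with a default 0..n-1 index, as in the sample. If that is not guaranteed, dropping just the leg level should keep the original labels so the assignment can align on them. The interleaved frame named mixed below is invented purely to illustrate this.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "leg":   [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    "speed": [10, 11, 12, 13, 12, 15, 19, 12, 10, 10, 12, 15, 19, 11],
})

# Deliberately interleave the two legs (order within each leg is preserved)
# to mimic a frame that is not sorted by leg.
order = [0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6, 7]
mixed = df.loc[order].copy()

roll = (
    mixed.groupby("leg")["speed"]
    .rolling(5, min_periods=1)
    .mean()
    .reset_index(level=0, drop=True)  # drop only the 'leg' level, keep row labels
)

# Assignment aligns on the row labels, so each mean lands on its own row.
mixed["roll_speed"] = roll.round(1)
print(mixed.sort_index())
```

Sorting the result by index should give the same roll_speed column as in the table above.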

answered Oct 18 '22 18:10

Vaishali

Using rolling(5) will get you the desired results for all but the first 4 occurrences of each group. We can fill the remaining values with the expanding mean:

(df.groupby('leg').speed.rolling(5)
    .mean().fillna(df.groupby('leg').speed.expanding().mean())
).reset_index(drop=True)

0     10.000000
1     10.500000
2     11.000000
3     11.500000
4     11.600000
5     12.600000
6     14.200000
7     14.200000
8     10.000000
9     10.000000
10    10.666667
11    11.750000
12    13.200000
13    13.400000
Name: speed, dtype: float64
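As a sanity check, the fillna approach and rolling with min_periods=1 should agree on the sample data, since an expanding mean over the first four rows of a group is the same as a window that has not yet filled up. A minimal comparison, assuming the frame from the question (the variable names are illustrative):

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "leg":   [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    "speed": [10, 11, 12, 13, 12, 15, 19, 12, 10, 10, 12, 15, 19, 11],
})

grouped = df.groupby("leg")["speed"]

# Approach 1: let the window start producing values from the first row.
via_min_periods = grouped.rolling(5, min_periods=1).mean()

# Approach 2: full windows only, then fill the leading NaNs per group
# with the expanding (cumulative) mean.
via_fillna = grouped.rolling(5).mean().fillna(grouped.expanding().mean())

# Compare with a tolerance rather than exact float equality.
print(np.allclose(via_min_periods, via_fillna))
```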

answered Oct 18 '22 20:10

user3483203

Tags: python, pandas, average, rolling-average