I have a Series of values indexed by timestamps. The timestamps are irregularly spaced, and I would like to calculate a rolling statistic (say, the rolling mean) over the last N seconds, where N is a constant. Unfortunately, resampling at regular intervals before calculating the rolling quantity is NOT an option: it has to be calculated on the entire dataset.
Is there a good way to do this in pandas?
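Use a time-based window: pass an offset string such as '5s' to rolling() instead of an integer count. One way to do this is to reset your index to an integer index and perform the rolling operation on the timestamp column via the on= argument.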
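import numpy as np
import pandas as pd
from random import randint

# generate some data: 5 values with irregularly spaced, non-decreasing timestamps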
data = pd.DataFrame(data={'vals':range(5), 'seed_ts': [np.datetime64('2017-04-13T09:00:00') for x in range(5)]})
data['random_offset'] = [np.timedelta64(randint(0, 5), 's') for x in range(5)]
data['cum_time'] = data['random_offset'].cumsum()
data['ts'] = data['seed_ts'] + data['cum_time']
data.index = data['ts']
data = data[['vals']]
Reset the index:
data = data.reset_index()
Compute the rolling mean over the past 5 seconds:
data['rolling_mean'] = data.rolling('5s', on='ts')['vals'].mean()
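Since the data above is random, it can help to check the behaviour on fixed input. The following is a minimal sketch, assuming hand-picked timestamps and a hypothetical frame named demo (neither is from the original answer). It also shows that, as far as I know, the same window can be applied directly to a DatetimeIndex without resetting it. With the default settings the window for each row covers the interval (t - 5s, t], so a point exactly 5 seconds old is excluded.

```python
import pandas as pd

# Hypothetical, hand-picked irregular timestamps (not from the original data)
ts = pd.to_datetime([
    "2017-04-13 09:00:00",
    "2017-04-13 09:00:01",
    "2017-04-13 09:00:04",
    "2017-04-13 09:00:09",
    "2017-04-13 09:00:10",
])
demo = pd.DataFrame({"ts": ts, "vals": [0.0, 1.0, 2.0, 3.0, 4.0]})

# Time-based window on a column: each row averages the values whose
# timestamp lies in (ts - 5s, ts]
demo["rolling_mean"] = demo.rolling("5s", on="ts")["vals"].mean()

# Same computation with the timestamps as a DatetimeIndex
indexed_mean = demo.set_index("ts")["vals"].rolling("5s").mean()

print(demo)
```

The row at 09:00:09 averages only its own value, because the point at 09:00:04 sits exactly on the open left edge of the window. If that point should be included, rolling() accepts a closed argument (for example closed='both').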