Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas rolling apply with variable window length

Tags:

python

pandas

I have a Series with values indexed by timestamps. These timestamps are irregularly spaced and I would like to calculate something like the rolling mean (say) over the last N seconds, where N is a constant. Unfortunately, resampling at regular intervals before calculating the rolling quantity is NOT an option - the rolling quantity has to be calculated on the entire dataset.

Is there a good way to do this in pandas?

like image 374
wfh Avatar asked Jun 21 '14 01:06

wfh


1 Answers

You want to reset your index to an integer index and perform the rolling operation on a timestamp column.

# generate some data
data = pd.DataFrame(data={'vals':range(5), 'seed_ts': [np.datetime64('2017-04-13T09:00:00') for x in range(5)]})
data['random_offset'] = [np.timedelta64(randint(0, 5), 's') for x in range(5)]
data['cum_time'] = data['random_offset'].cumsum()
data['ts'] = data['seed_ts'] + data['cum_time']
data.index = data['ts']
data = data[['vals']]

Reset the index:

data = data.reset_index()

Compute the rolling mean over the past 5 seconds:

data['rolling_mean'] = data.rolling('5s', on='ts')['vals'].mean()
like image 106
Abigail Avatar answered Nov 05 '22 11:11

Abigail