I want to compute a moving average over a time-based window on an irregular time series using pandas. Ideally, the window should be exponentially weighted using pandas.DataFrame.ewm, but its arguments (e.g. span) do not accept time-based windows.
If we try pandas.DataFrame.rolling instead, we find that time-based windows cannot be combined with win_type:
import pandas as pd

dft = pd.DataFrame({'B': [0, 1, 2, 3, 4]},
                   index=pd.Index([pd.Timestamp('20130101 09:00:00'),
                                   pd.Timestamp('20130101 09:00:02'),
                                   pd.Timestamp('20130101 09:00:03'),
                                   pd.Timestamp('20130101 09:00:05'),
                                   pd.Timestamp('20130101 09:00:06')],
                                  name='foo'))

# Combining an offset window with win_type fails:
dft.rolling('2s', win_type='triang').sum()
# ValueError: Invalid window 2s
How can I calculate a non-uniformly weighted, time-based moving average over an irregular time series?
The expected output for dft.ewm(alpha=0.9, adjust=False).sum() associated with a window of '2s' would be [0*1, 1*1, 2*1+1*0.9, 3*1, 4*1+3*0.9].
The pandas documentation is misleading here. As you found out, you can't pass an offset while using win_type. What you can do as a workaround is pass your own function using .apply. E.g., if you want to use triangle windows:
import pandas as pd
from scipy.signal.windows import triang

dft = pd.DataFrame(
    {"B": [0, 1, 2, 3, 4]},
    index=pd.Index(
        [
            pd.Timestamp("20130101 09:00:00"),
            pd.Timestamp("20130101 09:00:02"),
            pd.Timestamp("20130101 09:00:03"),
            pd.Timestamp("20130101 09:00:05"),
            pd.Timestamp("20130101 09:00:06"),
        ],
        name="foo",
    ),
)


def triangle_sum(window):
    weights = triang(len(window))
    return (weights * window).sum()


dft.rolling("2s").apply(triangle_sum, raw=True)
You can define your own weighting scheme in the same way and, if performance is a concern, use Numba.
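As a rough illustration of a custom weighting scheme, here is a minimal sketch of two exponentially decaying sums built on the same .apply workaround. The names DECAY, TAU_SECONDS, decayed_sum and time_decayed_sum are made up for this example and are not part of pandas. The first function weights by position inside the window (the reading the expected output in the question seems to use); the second weights by elapsed time, which assumes raw=False so that each window arrives as a Series carrying its timestamps.

```python
import numpy as np
import pandas as pd

dft = pd.DataFrame(
    {"B": [0, 1, 2, 3, 4]},
    index=pd.DatetimeIndex(
        [
            "2013-01-01 09:00:00",
            "2013-01-01 09:00:02",
            "2013-01-01 09:00:03",
            "2013-01-01 09:00:05",
            "2013-01-01 09:00:06",
        ],
        name="foo",
    ),
)

DECAY = 0.9        # assumed per-step decay factor (hypothetical name)
TAU_SECONDS = 1.0  # assumed decay time constant in seconds (hypothetical name)


def decayed_sum(window):
    """Position-based weights: newest value gets 1, the one before DECAY, etc."""
    # With raw=True, `window` is a NumPy array ordered oldest to newest.
    weights = DECAY ** np.arange(len(window) - 1, -1, -1)
    return float((weights * window).sum())


def time_decayed_sum(window):
    """Time-based weights: exp(-age / TAU_SECONDS), age measured from the newest row."""
    # With raw=False, `window` is a Series whose index holds the timestamps.
    age = (window.index[-1] - window.index).total_seconds()
    weights = np.exp(-np.asarray(age, dtype=float) / TAU_SECONDS)
    return float((weights * window.to_numpy()).sum())


by_position = dft.rolling("2s").apply(decayed_sum, raw=True)
by_time = dft.rolling("2s").apply(time_decayed_sum, raw=False)
```

With the default right-closed window, by_position should reproduce the values listed in the question, i.e. 0, 1, 2.9, 3 and 6.7. The raw=False variant is slower because a Series is built for every window, but it is the only one of the two that accounts for how irregular the spacing actually is.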