My data is:
>>> ts = pd.TimeSeries(data,indexconv)
>>> tsgroup = ts.resample('t',how='sum')
>>> tsgroup
2014-11-08 10:30:00 3
2014-11-08 10:31:00 4
2014-11-08 10:32:00 7
[snip]
2014-11-08 10:54:00 5
2014-11-08 10:55:00 2
Freq: T, dtype: int64
>>> tsgroup.plot()
>>> plt.show()
indexconv are strings converted using datetime.strptime.
The plot is very edgy like this (these aren't my actual plots):
How can I smooth it out like this:
I know about scipy.interpolate mentioned in this article (which is where I got the images from), but how can I apply it to a Pandas time series?
I found this great library called Vincent that deals with Pandas, but it doesn't support Python 2.6.
To smooth time series data in Pandas, we can use the exponentially weighted window functions and compute an exponentially weighted moving average.
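As a rough sketch of the exponentially weighted approach, assuming a recent pandas where Series.ewm() is available: the sample values below are made up to stand in for tsgroup, and span=3 is an arbitrary choice, not a recommendation.

```python
import pandas as pd

# Hypothetical per-minute counts standing in for tsgroup
index = pd.date_range("2014-11-08 10:30", periods=6, freq="min")
tsgroup = pd.Series([3, 4, 7, 2, 5, 2], index=index)

# Exponentially weighted moving average.
# span=3 is an arbitrary illustration; a larger span smooths more.
smoothed = tsgroup.ewm(span=3).mean()

print(smoothed)
```

The result has the same index as the input, so smoothed.plot() can be drawn on the same axes as tsgroup.plot() for comparison.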
Got it. With help from this question, here's what I did:
Resample my tsgroup from minutes to seconds.
>>> tsres = tsgroup.resample('S')
>>> tsres
2014-11-08 10:30:00 3
2014-11-08 10:30:01 NaN
2014-11-08 10:30:02 NaN
2014-11-08 10:30:03 NaN
...
2014-11-08 10:54:58 NaN
2014-11-08 10:54:59 NaN
2014-11-08 10:55:00 2
Freq: S, Length: 1501
Interpolate the data using .interpolate(method='cubic'). This passes the data to scipy.interpolate.interp1d and uses the cubic kind, so you need to have scipy installed (pip install scipy) [1].
>>> tsint = tsres.interpolate(method='cubic')
>>> tsint
2014-11-08 10:30:00 3.000000
2014-11-08 10:30:01 3.043445
2014-11-08 10:30:02 3.085850
2014-11-08 10:30:03 3.127220
...
2014-11-08 10:54:58 2.461532
2014-11-08 10:54:59 2.235186
2014-11-08 10:55:00 2.000000
Freq: S, Length: 1501
Plot it using tsint.plot(). Here's a comparison between the original tsgroup and tsint:
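For anyone on a newer pandas, the same steps can be sketched roughly as follows. This assumes a version where resample() returns a Resampler object, so .asfreq() is needed to get the NaN-filled series shown above, and it assumes scipy is installed. The sample values are invented and are not the data from the question.

```python
import pandas as pd

# Hypothetical per-minute counts standing in for tsgroup
index = pd.date_range("2014-11-08 10:30", periods=6, freq="min")
tsgroup = pd.Series([3, 4, 7, 2, 5, 2], index=index)

# Step 1: upsample from minutes to seconds.
# The newly created rows are NaN until they are interpolated.
tsres = tsgroup.resample("s").asfreq()

# Step 2: cubic interpolation (pandas hands this off to scipy).
tsint = tsres.interpolate(method="cubic")

print(tsint.head())
```

Calling tsint.plot() afterwards gives the smooth curve, which passes through the original per-minute points.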
[1] If you're getting an error from .interpolate(method='cubic') telling you that Scipy isn't installed even if you do have it installed, open up /usr/lib64/python2.6/site-packages/scipy/interpolate/polyint.py or wherever your file might be and change the second line from "from scipy import factorial" to "from scipy.misc import factorial".
You can also smooth your data with a moving average, which effectively applies a low-pass filter to it. Pandas supports this with the rolling() method.
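A minimal sketch of the moving-average idea, again with made-up values in place of tsgroup. The window size of 3 and the choices center=True and min_periods=1 are assumptions for illustration; tune them to your data.

```python
import pandas as pd

# Hypothetical per-minute counts standing in for tsgroup
index = pd.date_range("2014-11-08 10:30", periods=6, freq="min")
tsgroup = pd.Series([3, 4, 7, 2, 5, 2], index=index)

# Centered 3-point moving average.
# min_periods=1 lets the first and last points average over
# whatever neighbours exist instead of becoming NaN.
rolled = tsgroup.rolling(window=3, center=True, min_periods=1).mean()

print(rolled)
```

A wider window smooths more aggressively but also flattens genuine peaks, so it is worth plotting rolled against tsgroup before settling on a size.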