
Plot smooth curves of Pandas Series data

My data is:

>>> ts = pd.TimeSeries(data,indexconv)
>>> tsgroup = ts.resample('t',how='sum')
>>> tsgroup
2014-11-08 10:30:00    3
2014-11-08 10:31:00    4
2014-11-08 10:32:00    7
  [snip]
2014-11-08 10:54:00    5
2014-11-08 10:55:00    2
Freq: T, dtype: int64
>>> tsgroup.plot()
>>> plt.show()

indexconv holds the timestamps, converted from strings using datetime.strptime.

The plot is very edgy, like this (these aren't my actual plots): [image]

How can I smooth it out like this: [image]

I know about scipy.interpolate mentioned in this article (which is where I got the images from), but how can I apply it for Pandas time series?

I found this great library called Vincent that deals with Pandas, but it doesn't support Python 2.6.

Alaa Ali asked Nov 24 '14 02:11




2 Answers

Got it. With help from this question, here's what I did:

  1. Resample my tsgroup from minutes to seconds.

    >>> tsres = tsgroup.resample('S')
    >>> tsres
    2014-11-08 10:30:00     3
    2014-11-08 10:30:01   NaN
    2014-11-08 10:30:02   NaN
    2014-11-08 10:30:03   NaN
    ...
    2014-11-08 10:54:58   NaN
    2014-11-08 10:54:59   NaN
    2014-11-08 10:55:00     2
    Freq: S, Length: 1501
  2. Interpolate the data using .interpolate(method='cubic'). This passes the data to scipy.interpolate.interp1d and uses the cubic kind, so you need to have scipy installed (pip install scipy); see note 1 below.

    >>> tsint = tsres.interpolate(method='cubic')
    >>> tsint
    2014-11-08 10:30:00    3.000000
    2014-11-08 10:30:01    3.043445
    2014-11-08 10:30:02    3.085850
    2014-11-08 10:30:03    3.127220
    ...
    2014-11-08 10:54:58    2.461532
    2014-11-08 10:54:59    2.235186
    2014-11-08 10:55:00    2.000000
    Freq: S, Length: 1501
  3. Plot it using tsint.plot(). Here's a comparison between the original tsgroup and tsint:

Note 1: If .interpolate(method='cubic') raises an error telling you that SciPy isn't installed even though you do have it installed, open /usr/lib64/python2.6/site-packages/scipy/interpolate/polyint.py (or wherever that file lives on your system) and change the second line from "from scipy import factorial" to "from scipy.misc import factorial".
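
For anyone on a recent pandas, here is a minimal sketch of the same three steps. It rests on a few assumptions: the tsgroup values below are made-up stand-ins for the real data, SciPy is installed, and the upsampling step is written as .resample(...).asfreq() because newer pandas versions return a Resampler object from .resample() rather than a Series.

```python
import numpy as np
import pandas as pd

# Hypothetical per-minute counts standing in for the real tsgroup.
idx = pd.date_range("2014-11-08 10:30", "2014-11-08 10:55", freq="min")
rng = np.random.default_rng(0)
tsgroup = pd.Series(rng.integers(1, 10, size=len(idx)), index=idx)

# Step 1: upsample from minutes to seconds; the newly created rows are NaN.
tsres = tsgroup.resample("s").asfreq()

# Step 2: fill the NaNs with a cubic interpolation (needs SciPy).
tsint = tsres.interpolate(method="cubic")

# Step 3: plot with tsint.plot() (needs matplotlib); left out here so the
# sketch runs without a display.
```

The interpolated curve still passes through every original per-minute value, so this smooths the line between points without changing the points themselves.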

Alaa Ali answered Sep 28 '22 02:09


You can smooth out your data with moving averages as well, effectively applying a low-pass filter to your data. Pandas supports this with the rolling() method.
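
As a rough sketch of the moving-average approach: the tsgroup values below are made up for illustration, and the 5-sample window, center=True, and min_periods=1 are assumptions chosen for the example, not recommendations for any particular data set.

```python
import numpy as np
import pandas as pd

# Hypothetical per-minute counts standing in for the real tsgroup.
idx = pd.date_range("2014-11-08 10:30", periods=26, freq="min")
rng = np.random.default_rng(0)
tsgroup = pd.Series(rng.integers(1, 10, size=len(idx)), index=idx)

# Centered 5-sample moving average. min_periods=1 lets the windows at the
# edges average whatever samples they have instead of producing NaN.
smoothed = tsgroup.rolling(window=5, center=True, min_periods=1).mean()

# smoothed.plot() would draw the result (needs matplotlib).
```

Unlike interpolation, this changes the values at the original timestamps: peaks get flattened, and a wider window flattens them more.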

Marcus answered Sep 28 '22 00:09