Convert Pandas dataframe to time series

Tags:

pandas

I have a Pandas DataFrame:

Out[57]: 
      lastrun           rate
0   2013-11-04 12:15:02   0
1   2013-11-04 13:14:50   4
2   2013-11-04 14:14:48   10
3   2013-11-04 16:14:59   16

I would like to convert that into an hourly time series and interpolate missing values (15:00) so that I end up with:

2013-11-04 12:00:00   0
2013-11-04 13:00:00   4
2013-11-04 14:00:00   10
2013-11-04 15:00:00   13
2013-11-04 16:00:00   16

How do I convert / map the dataframe data to a time series in Pandas?

greenafrican asked Nov 11 '13 20:11


1 Answer

Assuming your 'lastrun' column holds datetime objects (if it holds strings, convert it first with pd.to_datetime):

In [22]: s = df.set_index('lastrun')['rate'].resample('H').mean()  # recent pandas needs an explicit aggregation such as .mean()
In [23]: s
Out[23]: 
lastrun
2013-11-04 12:00:00     0
2013-11-04 13:00:00     4
2013-11-04 14:00:00    10
2013-11-04 15:00:00   NaN
2013-11-04 16:00:00    16
Freq: H, dtype: float64

In [24]: s.interpolate()
Out[24]: 
lastrun
2013-11-04 12:00:00     0
2013-11-04 13:00:00     4
2013-11-04 14:00:00    10
2013-11-04 15:00:00    13
2013-11-04 16:00:00    16
Freq: H, dtype: int64

That's if you want linear interpolation, which is the default. pandas 0.13 and later offer a bunch more interpolation options.
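
For reference, here is a self-contained sketch of the same approach that should run on current pandas. It assumes the 'lastrun' values start out as strings, so it converts them first. The names `hourly` and `filled` are illustrative and not from the answer above, and `pd.Timedelta(hours=1)` stands in for the 'H' alias, which newer pandas releases deprecate.

```python
import pandas as pd

# Rebuild the DataFrame from the question; 'lastrun' is stored as strings here (assumption).
df = pd.DataFrame({
    "lastrun": [
        "2013-11-04 12:15:02",
        "2013-11-04 13:14:50",
        "2013-11-04 14:14:48",
        "2013-11-04 16:14:59",
    ],
    "rate": [0, 4, 10, 16],
})

# resample() needs a DatetimeIndex, so parse the strings before indexing.
df["lastrun"] = pd.to_datetime(df["lastrun"])

# Bin the observations into hourly buckets. Each bucket takes the mean of its
# rows; the empty 15:00 bucket becomes NaN.
hourly = df.set_index("lastrun")["rate"].resample(pd.Timedelta(hours=1)).mean()

# Fill the gap by linear interpolation (the default method).
filled = hourly.interpolate()

print(filled)
```

Because the hourly index is evenly spaced, the 15:00 value lands halfway between 10 and 16. On an unevenly spaced index, `interpolate(method="time")` would weight by the actual time gaps instead.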

TomAugspurger answered Oct 23 '22 15:10