
pandas dataframe resample per day without date time index

I have a dataframe in pandas of the following form:

             timestamps   light
7   2004-02-28 00:58:45  150.88
26  2004-02-28 00:59:45  143.52
34  2004-02-28 01:00:45  150.88
42  2004-02-28 01:01:15  150.88
59  2004-02-28 01:02:15  150.88

Note that the index is not the timestamps column. I want to resample (or otherwise bin) the data to get the average value of the light column per minute, hour, day, etc. I have looked into the resample method that pandas offers, and it appears to require the dataframe to have a datetime index (unless I've misunderstood this).

  1. So my first question is: can I re-index the dataframe to have timestamps as the index? (Note that timestamps are not unique: for each timestamp there are about 30 rows, each representing a sensor.)

  2. If not, is there some other way to get another dataframe that has the average value of light per hour, per day, per month, etc.?

Any help would be appreciated.

Nikhil asked Jun 15 '16 17:06



1 Answer

For pandas version 0.19.0 and newer, you can use the on keyword to resample on a datetime column instead of the index:

df.resample('H', on='timestamps').mean()

Result:

                      light
timestamps                 
2004-02-28 00:00:00  147.20
2004-02-28 01:00:00  150.88
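
For anyone who wants to reproduce this, here is a minimal, self-contained sketch. It assumes the timestamps column has been parsed to datetime (via pd.to_datetime) and rebuilds only the five rows shown in the question. The frequency string "60min" is my substitution for 'H', chosen because newer pandas releases warn about the uppercase hour alias; the variable names are illustrative. The last step also touches on the first question: setting timestamps as the index should work even with repeated timestamps, since resample only needs a datetime-like index, not a unique one.

```python
import pandas as pd

# Sample rows copied from the question; the index is deliberately NOT the timestamps.
df = pd.DataFrame(
    {
        "timestamps": pd.to_datetime(
            [
                "2004-02-28 00:58:45",
                "2004-02-28 00:59:45",
                "2004-02-28 01:00:45",
                "2004-02-28 01:01:15",
                "2004-02-28 01:02:15",
            ]
        ),
        "light": [150.88, 143.52, 150.88, 150.88, 150.88],
    },
    index=[7, 26, 34, 42, 59],
)

# Hourly mean, resampling on the column rather than the index (pandas >= 0.19.0).
# "60min" is used here as a version-neutral stand-in for the hourly alias.
hourly = df.resample("60min", on="timestamps").mean()

# Daily mean: same call with a different frequency string.
daily = df.resample("D", on="timestamps").mean()

# Alternative route: make timestamps the index first, then resample.
# A DatetimeIndex does not have to be unique for this to work.
hourly_via_index = df.set_index("timestamps").resample("60min").mean()

print(hourly)
print(daily)
```

Swapping the frequency string (for example a per-minute or month-based alias) gives the other bin sizes asked about in the question.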
Stef answered Sep 22 '22 03:09
