Resample DataFrame for every hour

Tags:

python

I want to resample the data in the SMS, CALL and INTERNET columns, replacing the values with their mean for every hour.

First attempt:

df1.reset_index().set_index('TIME').resample('1H').mean()

Error: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'

Second attempt:

df1['TIME'] = pd.to_datetime(data['TIME'])
df1.CALL.resample('60min', how='mean')

Error: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex'

DataFrame:

    ID          TIME         SMS          CALL      INTERNET
0   1   2013-11-30 23:00:00 0.277204    0.273629    13.674575
1   1   2013-11-30 23:10:00 0.341536    0.058176    13.330858
2   1   2013-11-30 23:20:00 0.379427    0.054601    11.329552
3   1   2013-11-30 23:30:00 0.600781    0.218489    13.166163
4   1   2013-11-30 23:40:00 0.405565    0.134176    13.347791
5   1   2013-11-30 23:50:00 0.187700    0.080738    12.434744
6   1   2013-12-01 00:00:00 0.282651    0.135964    13.860353
7   1   2013-12-01 00:10:00 0.109826    0.056388    12.583463
8   1   2013-12-01 00:20:00 0.348638    0.053438    12.644995
9   1   2013-12-01 00:30:00 0.138375    0.054062    12.251733
10  1   2013-12-01 00:40:00 0.054062    0.163803    11.292642


df1.dtypes
ID            int64
TIME         object
SMS         float64
CALL        float64
INTERNET    float64
dtype: object
Shruti Bothe asked Dec 23 '22 08:12


1 Answer

You can use the on parameter of resample, which resamples on a column instead of the index:

on : string, optional

For a DataFrame, column to use instead of index for resampling. Column must be datetime-like.
New in version 0.19.0.

df1['TIME'] = pd.to_datetime(df1['TIME'])
df = df1.resample('60min', on='TIME').mean()
print(df)
                     ID       SMS      CALL   INTERNET
TIME                                                  
2013-11-30 23:00:00   1  0.365369  0.136635  12.880614
2013-12-01 00:00:00   1  0.186710  0.092731  12.526637
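
To see why the conversion step matters, here is a minimal sketch on a made-up three-row frame (the values are illustrative, not the data from the question). It assumes that TIME starts out as plain strings, as df1.dtypes in the question suggests, and shows that resample raises a TypeError until the column is converted with pd.to_datetime:

```python
import pandas as pd

# Hypothetical sample: TIME holds strings, so it is not datetime-like yet.
df1 = pd.DataFrame({
    "TIME": ["2013-11-30 23:00:00", "2013-11-30 23:30:00", "2013-12-01 00:10:00"],
    "CALL": [0.25, 0.75, 0.5],
})

# Resampling on a non-datetime column fails with the error from the question.
try:
    df1.resample("60min", on="TIME").mean()
    raised = False
except TypeError as exc:
    raised = True
    print("resample failed:", exc)

# After conversion the column is datetime64 and resample works.
df1["TIME"] = pd.to_datetime(df1["TIME"])
hourly = df1.resample("60min", on="TIME").mean()
print(hourly)
```

The same check applies to the first attempt in the question: set_index('TIME') on a string column only produces a generic Index, not a DatetimeIndex.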

Or use set_index to turn the TIME column into a DatetimeIndex before resampling:

df1['TIME'] = pd.to_datetime(df1['TIME'])
df = df1.set_index('TIME').resample('60min').mean()
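
Both variants should give the same hourly means. The sketch below checks that on a small made-up frame (the numbers are illustrative, not the question's data). It also restricts the result to the SMS, CALL and INTERNET columns, on the assumption that averaging ID is not wanted; the cols list is just a helper name for this example:

```python
import pandas as pd

# Hypothetical sample with the same column layout as the question.
df1 = pd.DataFrame({
    "ID": [1, 1, 1, 1],
    "TIME": ["2013-11-30 23:00:00", "2013-11-30 23:30:00",
             "2013-12-01 00:00:00", "2013-12-01 00:30:00"],
    "SMS": [0.25, 0.75, 1.0, 2.0],
    "CALL": [0.5, 0.5, 0.25, 0.75],
    "INTERNET": [12.0, 14.0, 10.0, 11.0],
})
df1["TIME"] = pd.to_datetime(df1["TIME"])

# Only the measurement columns are averaged; ID is left out.
cols = ["SMS", "CALL", "INTERNET"]

# Variant 1: resample on the TIME column directly.
via_on = df1.resample("60min", on="TIME")[cols].mean()

# Variant 2: make TIME the index first, then resample.
via_index = df1.set_index("TIME")[cols].resample("60min").mean()

print(via_on)
print(via_on.equals(via_index))
```

Which one to pick is mostly a matter of taste: on= leaves df1 unchanged, while set_index is handy if the frame will be sliced by time afterwards.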
jezrael answered Jan 05 '23 03:01