Resample daily pandas timeseries with start at time other than midnight [duplicate]

Tags: python, pandas

I have a pandas timeseries of 10-min frequency data and need to find the maximum value in each 24-hour period. However, this 24-hour period needs to start each day at 5AM, not at the default midnight which pandas assumes.

I've been checking out DateOffset but so far am drawing blanks. I might have expected something akin to pandas.tseries.offsets.Week(weekday=n), e.g. pandas.tseries.offsets.Week(hour=5), but this is not supported as far as I can tell.

I can do a nasty workaround by shifting the data first, but it's unintuitive, and even coming back to the same code after just a week I have problems wrapping my head around the shift direction!
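
For reference, the shift workaround described above might look roughly like the sketch below. The sample series is hypothetical (the question does not include data), and it assumes a pandas version where resample() returns a resampler that needs an explicit aggregation such as .max(). The timestamps are moved back by 5 hours so that 05:00 lands on midnight, the data is resampled on plain daily bins, and the labels are then moved forward again.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three days of 10-minute readings.
idx = pd.date_range('2012-01-01 00:00:00', freq='10min', periods=24 * 6 * 3)
s = pd.Series(np.arange(len(idx)), index=idx)

five_hours = pd.Timedelta(hours=5)

# Step 1: shift timestamps BACK by 5 hours, so 05:00 becomes 00:00.
shifted = s.copy()
shifted.index = shifted.index - five_hours

# Step 2: resample on ordinary midnight-to-midnight bins.
daily_max = shifted.resample('24h').max()

# Step 3: shift the bin labels FORWARD by 5 hours, so each label
# shows the real start of its 05:00-to-05:00 window.
daily_max.index = daily_max.index + five_hours

print(daily_max)
```

The two opposite shifts are the part that is easy to get backwards, which is why a keyword on resample() itself is easier to read.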

Any more elegant ideas would be much appreciated.

asked Dec 04 '13 11:12 by ajt


2 Answers

The base keyword can do the trick (see docs):

s.resample('24h', base=5)

For example:

In [35]: idx = pd.date_range('2012-01-01 00:00:00', freq='5min', periods=24*12*3)

In [36]: s = pd.Series(np.arange(len(idx)), index=idx)

In [38]: s.resample('24h', base=5)
Out[38]: 
2011-12-31 05:00:00     29.5
2012-01-01 05:00:00    203.5
2012-01-02 05:00:00    491.5
2012-01-03 05:00:00    749.5
Freq: 24H, dtype: float64
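
Note that the session above reflects the pandas of the time, when resample() aggregated with the mean by default. In later releases the base keyword was deprecated (pandas 1.1) and then removed (pandas 2.0), and the offset keyword covers the same use case. A minimal sketch of the equivalent call, assuming pandas 1.1 or newer and using .max() because the question asks for the daily maximum:

```python
import numpy as np
import pandas as pd

# Same sample data as in the answer above: three days of 5-minute readings.
idx = pd.date_range('2012-01-01 00:00:00', freq='5min', periods=24 * 12 * 3)
s = pd.Series(np.arange(len(idx)), index=idx)

# offset='5h' moves the daily bin edges from midnight to 05:00.
# An explicit aggregation (.max() here) is required in modern pandas.
daily_max = s.resample('24h', offset='5h').max()

print(daily_max)
```

As in the original output, the first bin is labelled 2011-12-31 05:00:00, because the readings from 00:00 to 04:55 on the first day belong to the window that started at 05:00 the day before.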
answered Oct 17 '22 21:10 by joris


I've just spotted an answered question which didn't come up on Google or Stack Overflow previously:

Resample hourly TimeSeries with certain starting hour

This uses the base parameter, which appears to have been added after Wes McKinney's Python for Data Analysis was written. I've given the parameter a go and it seems to do the trick.

answered Oct 17 '22 21:10 by ajt