I am fairly new to python and pandas, so I apologise for any future misunderstandings.
I have a pandas DataFrame with hourly values, looking something like this:
2014-04-01 09:00:00 52.9 41.1 36.3
2014-04-01 10:00:00 56.4 41.6 70.8
2014-04-01 11:00:00 53.3 41.2 49.6
2014-04-01 12:00:00 50.4 39.5 36.6
2014-04-01 13:00:00 51.1 39.2 33.3
2016-11-30 16:00:00 16.0 13.5 36.6
2016-11-30 17:00:00 19.6 17.4 44.3
Now I need to calculate 24h average values for each column starting from 2014-04-01 12:00 to 2014-04-02 11:00 So I want daily averages from noon to noon.
Unfortunately, I have no idea how to do that. I have read some suggestions to use groupby, but I don't really know how...
Thank you very much in advance! Any help is appreciated!!
For newer versions of pandas (>= 1.1.0) use the offset
argument:
df.resample('24H', offset='12H').mean()
base
argument.A day is 24 hours, so a base of 12 would start the grouping from Noon - Noon. Resample gives you all days in between, so you could .dropna(how='all')
if you don't need the complete basis. (I assume you have a DatetimeIndex
, if not you can use the on
argument of resample to specify your datetime column.)
df.resample('24H', base=12).mean()
#df.groupby(pd.Grouper(level=0, base=12, freq='24H')).mean() # Equivalent
1 2 3
0
2014-03-31 12:00:00 54.20 41.30 52.233333
2014-04-01 12:00:00 50.75 39.35 34.950000
2014-04-02 12:00:00 NaN NaN NaN
2014-04-03 12:00:00 NaN NaN NaN
2014-04-04 12:00:00 NaN NaN NaN
... ... ... ...
2016-11-26 12:00:00 NaN NaN NaN
2016-11-27 12:00:00 NaN NaN NaN
2016-11-28 12:00:00 NaN NaN NaN
2016-11-29 12:00:00 NaN NaN NaN
2016-11-30 12:00:00 17.80 15.45 40.450000
You could subtract your time and groupby:
df.groupby((df.index - pd.to_timedelta('12:00:00')).normalize()).mean()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With