
How to calculate daily averages from noon to noon with pandas?

I am fairly new to python and pandas, so I apologise for any future misunderstandings.

I have a pandas DataFrame with hourly values, looking something like this:

2014-04-01 09:00:00 52.9    41.1    36.3
2014-04-01 10:00:00 56.4    41.6    70.8
2014-04-01 11:00:00 53.3    41.2    49.6
2014-04-01 12:00:00 50.4    39.5    36.6
2014-04-01 13:00:00 51.1    39.2    33.3
...
2016-11-30 16:00:00 16.0    13.5    36.6
2016-11-30 17:00:00 19.6    17.4    44.3

Now I need to calculate 24-hour averages for each column, with each window running from noon to noon, e.g. from 2014-04-01 12:00 to 2014-04-02 11:00.

Unfortunately, I have no idea how to do that. I have read some suggestions to use groupby, but I don't really know how...

Thank you very much in advance! Any help is appreciated!!

NeedHelp asked Nov 14 '19 15:11


2 Answers

For newer versions of pandas (>= 1.1.0) use the offset argument:

df.resample('24H', offset='12H').mean()
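To make the offset approach concrete, here is a minimal sketch that rebuilds the first five rows from the question and resamples them. The column names "a", "b" and "c" are made up for illustration, and the lowercase "24h" / "12h" aliases are used on the assumption that they are accepted by both older and current pandas releases.

```python
import pandas as pd

# Hypothetical frame: the first five hourly rows from the question,
# with invented column names ("a", "b", "c").
idx = pd.date_range("2014-04-01 09:00", periods=5, freq="h")
df = pd.DataFrame(
    {
        "a": [52.9, 56.4, 53.3, 50.4, 51.1],
        "b": [41.1, 41.6, 41.2, 39.5, 39.2],
        "c": [36.3, 70.8, 49.6, 36.6, 33.3],
    },
    index=idx,
)

# 24-hour bins whose edges are shifted 12 hours away from midnight,
# so each bin runs from noon up to (not including) the next noon.
# The offset argument requires pandas >= 1.1.0.
noon_to_noon = df.resample("24h", offset="12h").mean()
print(noon_to_noon)
```

Each row of the result is labelled with the noon at which its window starts, so the 09:00 to 11:00 rows land in the bin labelled 2014-03-31 12:00:00.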

For older versions of pandas (< 1.1.0), use the base argument. It was deprecated in 1.1.0 and removed in 2.0.

A day is 24 hours, so a base of 12 starts each group at noon and ends it just before the next noon. Resample returns every day between the first and last timestamp, including days with no data, so you can append .dropna(how='all') if you don't need the empty days. (I assume you have a DatetimeIndex; if not, you can use the on argument of resample to specify your datetime column.)

df.resample('24H', base=12).mean()
#df.groupby(pd.Grouper(level=0, base=12, freq='24H')).mean() # Equivalent 

                         1      2          3
0                                           
2014-03-31 12:00:00  54.20  41.30  52.233333
2014-04-01 12:00:00  50.75  39.35  34.950000
2014-04-02 12:00:00    NaN    NaN        NaN
2014-04-03 12:00:00    NaN    NaN        NaN
2014-04-04 12:00:00    NaN    NaN        NaN
...                    ...    ...        ...
2016-11-26 12:00:00    NaN    NaN        NaN
2016-11-27 12:00:00    NaN    NaN        NaN
2016-11-28 12:00:00    NaN    NaN        NaN
2016-11-29 12:00:00    NaN    NaN        NaN
2016-11-30 12:00:00  17.80  15.45  40.450000
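The two side notes above (dropping the all-NaN days, and using on when the timestamps are a column rather than the index) can be sketched together. This is only an illustration: the frame, the "timestamp" and "value" column names, and the gap in the data are all invented, and it uses offset instead of base so that it also runs on pandas 2.x.

```python
import pandas as pd

# Hypothetical data with a multi-day gap. The timestamps are an ordinary
# column here ("timestamp" is an assumed name), not the index.
df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2014-04-01 10:00",
                "2014-04-01 11:00",
                "2014-04-01 12:00",
                "2014-04-05 13:00",
            ]
        ),
        "value": [1.0, 3.0, 5.0, 7.0],
    }
)

# on= tells resample which column holds the datetimes.
daily = df.resample("24h", on="timestamp", offset="12h").mean()

# resample emits one row per 24-hour bin, including bins with no data;
# dropna(how="all") keeps only the bins that had at least one value.
filled_only = daily.dropna(how="all")
print(filled_only)
```

Here daily has one row for every noon-to-noon window between the first and last timestamp, while filled_only keeps just the windows that contain observations.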
ALollz answered Nov 10 '22 21:11


You could shift the timestamps back by 12 hours and group by the resulting date:

df.groupby((df.index - pd.to_timedelta('12:00:00')).normalize()).mean()
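As a rough illustration of how the shifted key behaves, the sketch below applies the same idea to a small invented frame (a single hypothetical "value" column holding the first column of the question's sample). It assumes the frame has a DatetimeIndex.

```python
import pandas as pd

# Hypothetical hourly data spanning a noon boundary.
idx = pd.date_range("2014-04-01 09:00", periods=5, freq="h")
df = pd.DataFrame({"value": [52.9, 56.4, 53.3, 50.4, 51.1]}, index=idx)

# Shift every timestamp back 12 hours, then truncate to midnight.
# All rows from noon up to (not including) the next noon get the same key.
group_key = (df.index - pd.Timedelta(hours=12)).normalize()

by_noon_day = df.groupby(group_key).mean()
print(by_noon_day)
```

Two differences from resample are worth noting: the groups are labelled with the calendar date on which the window starts (at midnight, not noon), and days without any data simply do not appear, so no dropna step is needed.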
Quang Hoang answered Nov 10 '22 20:11