How to group a pandas dataframe by a defined time interval?

Tags: python, datetime, pandas, group-by

I have a DataFrame like this. I would like to group it into 60-minute intervals, starting at 06:30.

                         data
index
2017-02-14 06:29:57  11198648
2017-02-14 06:30:01  11198650
2017-02-14 06:37:22  11198706
2017-02-14 23:11:13  11207728
2017-02-14 23:21:43  11207774
2017-02-14 23:22:36  11207776

I am using:

df.groupby(pd.TimeGrouper(freq='60Min'))

I get this grouping:

                      data
index
2017-02-14 06:00:00     x1
2017-02-14 07:00:00     x2
2017-02-14 08:00:00     x3
2017-02-14 09:00:00     x4
2017-02-14 10:00:00     x5

but I am looking for this result:

                      data
index
2017-02-14 06:30:00     x1
2017-02-14 07:30:00     x2
2017-02-14 08:30:00     x3
2017-02-14 09:30:00     x4
2017-02-14 10:30:00     x5

How can I tell the function to start grouping at 06:30, in one-hour intervals?

If it cannot be done with .groupby(pd.TimeGrouper(freq='60Min')), what is the best way to do it?

Regards, and thanks very much in advance.

asked Feb 15 '17 16:02

EduardoRL

2 Answers

Use the base=30 and label='right' parameters of pd.Grouper.

Specifying label='right' labels each bin by its right edge, so the first bin is labelled 06:30 rather than 05:30. Also, base is 0 by default, which aligns the hourly bins to the top of the hour; setting it to 30 shifts the bin edges forward by 30 minutes.

Suppose you want to take the first element of every sub-group; then:

df.groupby(pd.Grouper(freq='60Min', base=30, label='right')).first()
# same thing using resample:
# df.resample('60Min', base=30, label='right').first()

yields:

                           data
index
2017-02-14 06:30:00  11198648.0
2017-02-14 07:30:00  11198650.0
2017-02-14 08:30:00         NaN
2017-02-14 09:30:00         NaN
2017-02-14 10:30:00         NaN
2017-02-14 11:30:00         NaN
2017-02-14 12:30:00         NaN
2017-02-14 13:30:00         NaN
2017-02-14 14:30:00         NaN
2017-02-14 15:30:00         NaN
2017-02-14 16:30:00         NaN
2017-02-14 17:30:00         NaN
2017-02-14 18:30:00         NaN
2017-02-14 19:30:00         NaN
2017-02-14 20:30:00         NaN
2017-02-14 21:30:00         NaN
2017-02-14 22:30:00         NaN
2017-02-14 23:30:00  11207728.0
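A note for newer pandas versions: the base argument was deprecated in pandas 1.1 and removed in 2.0, with offset (and origin) as its replacement. The following is a minimal sketch of the same grouping, assuming pandas 1.1 or later and rebuilding the sample data from the question by hand; the variable names are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
index = pd.to_datetime([
    "2017-02-14 06:29:57",
    "2017-02-14 06:30:01",
    "2017-02-14 06:37:22",
    "2017-02-14 23:11:13",
    "2017-02-14 23:21:43",
    "2017-02-14 23:22:36",
])
df = pd.DataFrame(
    {"data": [11198648, 11198650, 11198706, 11207728, 11207774, 11207776]},
    index=index,
)

# offset="30min" shifts the hourly bin edges to hh:30 (the role base=30
# played in older pandas); label="right" names each bin after its right edge.
grouped = df.groupby(
    pd.Grouper(freq="60min", offset="30min", label="right")
).first()

# The same grouping written with resample.
resampled = df.resample("60min", offset="30min", label="right").first()

# Without label="right", each bin is named after its left edge instead.
left_labelled = df.resample("60min", offset="30min").first()

# Show only the non-empty bins.
print(grouped.dropna())
```

With the default left labels, the 06:29:57 row falls into a bin labelled 05:30 and the bin labelled 06:30 begins with the 06:30:01 row, so which label to use depends on how the first reading should be reported.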

answered Sep 16 '22 17:09

Nickil Maveli

DataFrame.resample is a dedicated method for resampling time series, so we don't need DataFrame.groupby and pd.Grouper:

df.resample('60min', base=30, label='right').first()

Output

                           data
index
2017-02-14 06:30:00  11198648.0
2017-02-14 07:30:00  11198650.0
2017-02-14 08:30:00         NaN
2017-02-14 09:30:00         NaN
2017-02-14 10:30:00         NaN
2017-02-14 11:30:00         NaN
2017-02-14 12:30:00         NaN
2017-02-14 13:30:00         NaN
2017-02-14 14:30:00         NaN
2017-02-14 15:30:00         NaN
2017-02-14 16:30:00         NaN
2017-02-14 17:30:00         NaN
2017-02-14 18:30:00         NaN
2017-02-14 19:30:00         NaN
2017-02-14 20:30:00         NaN
2017-02-14 21:30:00         NaN
2017-02-14 22:30:00         NaN
2017-02-14 23:30:00  11207728.0

Note: when your dataframe has multiple columns, select the column you want to aggregate:

df.resample('60min', base=30, label='right')['data'].first()
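To illustrate the column selection, here is a small sketch under a few assumptions: the "other" column is hypothetical and added only for the demonstration, and offset="30min" stands in for base=30, which pandas 2.0 and later no longer accept (offset is available from pandas 1.1).

```python
import pandas as pd

# Sample data from the question, plus a hypothetical second column.
index = pd.to_datetime([
    "2017-02-14 06:29:57",
    "2017-02-14 06:30:01",
    "2017-02-14 06:37:22",
    "2017-02-14 23:11:13",
    "2017-02-14 23:21:43",
    "2017-02-14 23:22:36",
])
df = pd.DataFrame(
    {
        "data": [11198648, 11198650, 11198706, 11207728, 11207774, 11207776],
        "other": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],  # hypothetical column
    },
    index=index,
)

# Hourly bins with edges at hh:30, each labelled by its right edge.
bins = df.resample("60min", offset="30min", label="right")

# Selecting one column returns a Series with one value per bin.
first_data = bins["data"].first()

# A dict passed to agg applies a different aggregation to each column.
per_column = bins.agg({"data": "first", "other": "max"})

print(first_data.dropna())
print(per_column.dropna())
```

Selecting the column before aggregating keeps the result to a single Series, while agg with a dict is one way to keep several columns with a separate aggregation for each.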

answered Sep 20 '22 17:09

Erfan
