A per-hour histogram of datetime using Pandas

Tags:

python

datetime

pandas

Assume I have a timestamp column of datetime in a pandas.DataFrame. For the sake of example, the timestamps have one-second resolution. I would like to bin the events into 10-minute [1] buckets. I understand that I can represent the datetime as an integer timestamp and then use a histogram. Is there a simpler approach? Something built into pandas?

[1] 10 minutes is only an example. Ultimately, I would like to use different resolutions.

359

asked Jan 15 '16 15:01

Dror

1 Answer

To use a custom frequency like "10min", you have to use a pd.Grouper (named TimeGrouper in older pandas releases, as suggested by @johnchase), which operates on the index by default.

import numpy as np
import pandas as pd

# Generate 10000 one-second timestamps and draw 500 of them at random
df = pd.DataFrame(np.random.choice(pd.date_range(start='2015-01-14', periods=10000, freq='s'), 500), columns=['date'])
# Set the date as the index because the Grouper bins on the index; keep the column (drop=False) so there is something to count
df.set_index('date', drop=False, inplace=True)
# Count the events per 10-minute bin and plot the histogram
df.groupby(pd.Grouper(freq='10min')).count().plot(kind='bar')

[Figure: bar chart of event counts per 10-minute bin (https://i.stack.imgur.com/TMTyV.png)]
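Since the question asks for different resolutions, the same grouping can be wrapped in a small helper. This is only a sketch: `count_per_bin` and `events` are hypothetical names, and it assumes `pd.Grouper` is given `key=` so that it bins on a column and the index does not need to be changed.

```python
import numpy as np
import pandas as pd


def count_per_bin(frame, column, freq):
    """Count the rows of `frame` per time bin of width `freq` (e.g. '10min', '1h')."""
    # key= makes the Grouper bin on a column instead of the index
    return frame.groupby(pd.Grouper(key=column, freq=freq)).size()


# Sample data: 500 timestamps drawn at random from 10000 consecutive seconds
rng = np.random.default_rng(0)
stamps = pd.date_range("2015-01-14", periods=10_000, freq="s")
events = pd.DataFrame({"date": stamps[rng.integers(0, len(stamps), 500)]})

# Same data, two different resolutions
counts_10min = count_per_bin(events, "date", "10min")
counts_1h = count_per_bin(events, "date", "1h")
```

Calling `.plot(kind='bar')` on either result should give the corresponding histogram. Bins without any event are kept with a count of 0.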

Using `to_period`

It is also possible to use the to_period method, but as far as I know it does not work with a custom period like "10Min". This example takes an additional column to simulate the category of an item.

import numpy as np
import pandas as pd

# Number of samples
nb_sample = 500
# Generate one-second timestamps and draw a random subset, with a random type for each event
df = pd.DataFrame({'date': np.random.choice(pd.date_range(start='2015-01-14', periods=nb_sample*30, freq='s'), nb_sample),
                  'type': np.random.choice(['foo','bar','xxx'], nb_sample)})

# Group per hour and type, then move the types into columns
df = df.groupby([df['date'].dt.to_period('h'), 'type']).count().unstack()
# Drop the unnecessary outer column level
df.columns = df.columns.droplevel()
df.plot(kind='bar')

[Figure: bar chart of event counts per hour and type (https://i.stack.imgur.com/Qh1KS.png)]
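If a custom resolution such as 10 minutes is needed together with the per-type split, one possible workaround (not part of the answer above) is to round the timestamps down with `Series.dt.floor` instead of `to_period`. A minimal sketch, with `events`, `bins` and `per_type` as illustrative names:

```python
import numpy as np
import pandas as pd

# Sample data: random timestamps with a random type for each event
rng = np.random.default_rng(0)
nb_sample = 500
stamps = pd.date_range("2015-01-14", periods=nb_sample * 30, freq="s")
events = pd.DataFrame({
    "date": stamps[rng.integers(0, len(stamps), nb_sample)],
    "type": rng.choice(["foo", "bar", "xxx"], nb_sample),
})

# Round each timestamp down to the start of its 10-minute bin
bins = events["date"].dt.floor("10min").rename("bin")

# Rows: 10-minute bins, columns: item types, values: event counts
per_type = events.groupby([bins, "type"]).size().unstack(fill_value=0)
```

`per_type.plot(kind='bar')` should then draw one group of bars per bin. Unlike the Grouper approach, bins that contain no event at all do not appear in the result.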

111

answered Oct 28 '22 13:10

Romain
