Optimal way to create a histogram/frequency distribution in Oracle?

Tags: sql, oracle, histogram, frequency-distribution

I have an events table with two columns eventkey (unique, primary-key) and createtime, which stores the creation time of the event as the number of milliseconds since Jan 1 1970 in a NUMBER column.

I would like to create a "histogram" or frequency distribution that shows me how many events were created in each hour of the past week.

Is this the best way to write such a query in Oracle, using the width_bucket() function? Is it possible to derive the number of rows that fall into each bucket using one of the other Oracle analytic functions rather than using width_bucket to determine what bucket number each row belongs to and doing a count(*) over that?

-- 1305504000000 = 5/16/2011 12:00am GMT
-- 1306108800000 = 5/23/2011 12:00am GMT
select 
-- width_bucket numbers buckets from 1, so (bucket - 1) is the hour offset
timestamp '1970-01-01 00:00:00' + numtodsinterval((1305504000000/1000 + ((bucket - 1) * 60 * 60)), 'second') period_start,
numevents
from (
  select bucket, count(*) as numevents from (
    select eventkey, createtime, 
    width_bucket(createtime, 1305504000000, 1306108800000, 24 * 7) bucket
    from events 
    where createtime between 1305504000000 and 1306108800000
  ) group by bucket
) 
order by period_start


asked Jun 01 '11 13:06

matt b

2 Answers

If your createtime were a date column, this would be trivial:

SELECT TO_CHAR(createtime, 'DAY:HH24'), COUNT(*) 
  FROM EVENTS
 GROUP BY TO_CHAR(createtime, 'DAY:HH24');

As it is, casting the createtime column isn't too hard:

select TO_CHAR( 
         TO_DATE('19700101', 'YYYYMMDD') + createtime / 86400000, 
         'DAY:HH24') AS BUCKET, COUNT(*)
   FROM EVENTS
  WHERE createtime between 1305504000000 and 1306108800000
 group by TO_CHAR( 
         TO_DATE('19700101', 'YYYYMMDD') + createtime / 86400000, 
         'DAY:HH24') 
 order by 1
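
As a rough sketch of another option, the grouping can be done on the raw number and converted to a timestamp only for display. This assumes the buckets should line up with whole hours of the Unix epoch, which holds here because the range starts at midnight GMT. The alias hour_start_ms and the half-open range (>= start, < end) are my own choices, not part of the original query.

```sql
-- Sketch: bucket by hour using integer arithmetic on the millisecond value.
-- 3600000 = milliseconds per hour; hour_start_ms is a made-up alias.
select timestamp '1970-01-01 00:00:00'
         + numtodsinterval(hour_start_ms / 1000, 'second') as period_start,
       numevents
  from (select floor(createtime / 3600000) * 3600000 as hour_start_ms,
               count(*) as numevents
          from events
         -- half-open range, so an event at exactly the end boundary is excluded
         where createtime >= 1305504000000
           and createtime <  1306108800000
         group by floor(createtime / 3600000) * 3600000)
 order by period_start
```

Because period_start is a real timestamp rather than a day name, the rows sort chronologically. Hours with no events produce no row at all.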

If, alternatively, you're looking for the fencepost values (for example, where the first decile (0-10%) ends and the next (10-20%) begins), you'd do something like:

select min(createtime) over (partition by decile) as decile_start,
       max(createtime) over (partition by decile) as decile_end,
       decile
  from (select createtime, 
               ntile (10) over (order by createtime asc) as decile
          from events
         where createtime between 1305504000000 and 1306108800000
       )
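
The analytic form above returns one row per event, each repeating its decile's boundaries. If one row per decile is what's wanted, a possible variant (a sketch, not tested against your data) aggregates over the same ntile subquery instead. The numevents alias is illustrative.

```sql
-- Sketch: one row per decile with its first and last createtime.
select decile,
       min(createtime) as decile_start,
       max(createtime) as decile_end,
       count(*)        as numevents
  from (select createtime,
               ntile(10) over (order by createtime asc) as decile
          from events
         where createtime between 1305504000000 and 1306108800000)
 group by decile
 order by decile
```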

answered Nov 05 '22 09:11

Adam Musch

I'm unfamiliar with Oracle's date functions, but I'm pretty certain there's an equivalent way of writing this Postgres statement:

select date_trunc('hour', stamp), count(*)
from your_data
group by date_trunc('hour', stamp)
order by date_trunc('hour', stamp)
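
For what it's worth, the closest Oracle counterpart to date_trunc('hour', ...) is probably TRUNC with the 'HH24' format model. A minimal sketch against the events table from the question, assuming createtime holds milliseconds since the epoch and so has to be converted to a DATE first (the aliases are illustrative):

```sql
-- Sketch: Oracle analogue of the Postgres query, truncating to the hour.
-- 86400000 = milliseconds per day; DATE + number adds that many days.
select trunc(date '1970-01-01' + createtime / 86400000, 'HH24') as period_start,
       count(*) as numevents
  from events
 where createtime >= 1305504000000
   and createtime <  1306108800000
 group by trunc(date '1970-01-01' + createtime / 86400000, 'HH24')
 order by period_start
```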

answered Nov 05 '22 08:11

Denis de Bernardy
