
Grouping into interval of 5 minutes within a time range

I am having difficulty writing a MySQL query.

    SELECT a.timestamp, name, count(b.name)
    FROM time a, id b
    WHERE a.user = b.user
      AND a.id = b.id
      AND b.name = 'John'
      AND a.timestamp BETWEEN '2010-11-16 10:30:00' AND '2010-11-16 11:00:00'
    GROUP BY a.timestamp

This is my current output:

    timestamp            name  count(b.name)
    -------------------  ----  -------------
    2010-11-16 10:32:22  John  2
    2010-11-16 10:35:12  John  7
    2010-11-16 10:36:34  John  1
    2010-11-16 10:37:45  John  2
    2010-11-16 10:48:26  John  8
    2010-11-16 10:55:00  John  9
    2010-11-16 10:58:08  John  2

How do I group the results into 5-minute intervals?

I want my output to look like this:

    timestamp            name  count(b.name)
    -------------------  ----  -------------
    2010-11-16 10:30:00  John  2
    2010-11-16 10:35:00  John  10
    2010-11-16 10:40:00  John  0
    2010-11-16 10:45:00  John  8
    2010-11-16 10:50:00  John  0
    2010-11-16 10:55:00  John  11

sky asked Dec 03 '10 04:12


2 Answers

This works with any interval: replace 300 with the interval length in seconds.

PostgreSQL

    SELECT
        -- floor (not round) so each row falls into the interval that starts at or before it
        TIMESTAMP WITH TIME ZONE 'epoch' +
        INTERVAL '1 second' * floor(extract('epoch' from a.timestamp) / 300) * 300 as timestamp,
        name,
        count(b.name)
    FROM time a, id b
    WHERE …
    GROUP BY floor(extract('epoch' from a.timestamp) / 300), name


MySQL

    SELECT
        -- start of the 5-minute interval each row falls into
        FROM_UNIXTIME((UNIX_TIMESTAMP(a.timestamp) DIV 300) * 300) AS interval_start,
        name,
        count(b.name)
    FROM time a, id b
    WHERE …
    GROUP BY interval_start, name
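
The core of this approach is integer arithmetic on epoch seconds, and whether the quotient is floored (as DIV does for positive values) or rounded changes which interval a row lands in. As a rough illustration outside the database, here is a minimal Python sketch. The helper names `to_epoch`, `floor_bucket` and `nearest_bucket` are hypothetical, and naive timestamps are assumed to be UTC.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INTERVAL = 300  # interval length in seconds (5 minutes)


def to_epoch(ts: datetime) -> int:
    """Whole seconds since 1970-01-01, treating the naive timestamp as UTC (assumption)."""
    return int((ts - EPOCH).total_seconds())


def floor_bucket(ts: datetime, interval: int = INTERVAL) -> datetime:
    """Start of the interval containing ts, i.e. (epoch DIV interval) * interval."""
    return EPOCH + timedelta(seconds=(to_epoch(ts) // interval) * interval)


def nearest_bucket(ts: datetime, interval: int = INTERVAL) -> datetime:
    """Interval boundary closest to ts, roughly what rounding the quotient gives."""
    shifted = to_epoch(ts) + interval // 2
    return EPOCH + timedelta(seconds=(shifted // interval) * interval)


ts = datetime(2010, 11, 16, 10, 37, 45)  # one of the rows from the question
print(floor_bucket(ts))    # 2010-11-16 10:35:00
print(nearest_bucket(ts))  # 2010-11-16 10:40:00
```

The output the question asks for counts the 10:37:45 row under 10:35:00, which corresponds to the flooring variant.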

boecko answered Sep 19 '22 17:09


I came across the same issue.

I found that it is easy to group by any minute interval: divide the epoch value by the interval length in seconds, then either round or use floor to get rid of the remainder (floor assigns each row to the interval that starts at or before it). So for 5-minute intervals you would use 300 seconds.

    SELECT COUNT(*) cnt,
        to_timestamp(floor((extract('epoch' from timestamp_column) / 300)) * 300)
        AT TIME ZONE 'UTC' as interval_alias
    FROM TABLE_NAME
    GROUP BY interval_alias

Result:

    interval_alias       cnt
    -------------------  ----
    2010-11-16 10:30:00  2
    2010-11-16 10:35:00  10
    2010-11-16 10:45:00  8
    2010-11-16 10:55:00  11

This returns the data correctly grouped by the selected interval; however, it does not return the intervals that contain no data. To get those empty intervals we can use the generate_series function.

    SELECT generate_series(MIN(date_trunc('hour', timestamp_column)),
        max(date_trunc('minute', timestamp_column)), '5m') as interval_alias
    FROM TABLE_NAME

Result:

    interval_alias
    -------------------
    2010-11-16 10:30:00
    2010-11-16 10:35:00
    2010-11-16 10:40:00
    2010-11-16 10:45:00
    2010-11-16 10:50:00
    2010-11-16 10:55:00
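
For readers who want to see what such a series amounts to, here is a small Python sketch of a generator that behaves like a stepped, end-inclusive series of timestamps. The function name `time_series` is hypothetical, and the bounds are written out by hand (10:30 and 10:58) rather than derived from a table, which is an assumption of this example.

```python
from datetime import datetime, timedelta
from typing import Iterator


def time_series(start: datetime, stop: datetime, step: timedelta) -> Iterator[datetime]:
    """Yield start, start + step, ... for every value up to and including stop."""
    current = start
    while current <= stop:
        yield current
        current += step


series = list(
    time_series(
        datetime(2010, 11, 16, 10, 30),  # assumed lower bound
        datetime(2010, 11, 16, 10, 58),  # assumed upper bound (last row, truncated to the minute)
        timedelta(minutes=5),
    )
)
for interval_start in series:
    print(interval_start)
```

With these bounds the sketch prints six timestamps, from 10:30:00 to 10:55:00, because the next step (11:00:00) is past the upper bound.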

Now, to include the intervals with zero occurrences, we just outer join the two result sets.

    SELECT series.minute as interval, coalesce(cnt.amnt, 0) as count
    FROM (
        SELECT count(*) amnt,
            to_timestamp(floor((extract('epoch' from timestamp_column) / 300)) * 300)
            AT TIME ZONE 'UTC' as interval_alias
        FROM TABLE_NAME
        GROUP BY interval_alias
    ) cnt
    RIGHT JOIN (
        SELECT generate_series(min(date_trunc('hour', timestamp_column)),
            max(date_trunc('minute', timestamp_column)), '5m') as minute
        FROM TABLE_NAME
    ) series ON series.minute = cnt.interval_alias

The end result includes every 5-minute interval, even those that have no values.

    interval             count
    -------------------  -----
    2010-11-16 10:30:00  2
    2010-11-16 10:35:00  10
    2010-11-16 10:40:00  0
    2010-11-16 10:45:00  8
    2010-11-16 10:50:00  0
    2010-11-16 10:55:00  11
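
The same "series outer-joined with counts, missing values coalesced to 0" idea can be sketched in plain Python. This is only an illustration under stated assumptions: the rows are the per-timestamp counts from the question's current output, the range bounds come from the question's BETWEEN clause, and the names `floor_bucket`, `rows` and `result` are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
STEP = timedelta(minutes=5)


def floor_bucket(ts: datetime) -> datetime:
    """Start of the 5-minute interval containing ts (naive timestamps assumed UTC)."""
    return ts - (ts - EPOCH) % STEP


# (timestamp, count) pairs copied from the question's current output
rows = [
    (datetime(2010, 11, 16, 10, 32, 22), 2),
    (datetime(2010, 11, 16, 10, 35, 12), 7),
    (datetime(2010, 11, 16, 10, 36, 34), 1),
    (datetime(2010, 11, 16, 10, 37, 45), 2),
    (datetime(2010, 11, 16, 10, 48, 26), 8),
    (datetime(2010, 11, 16, 10, 55, 0), 9),
    (datetime(2010, 11, 16, 10, 58, 8), 2),
]

# Step 1: aggregate per interval (the GROUP BY part)
counts = Counter()
for ts, n in rows:
    counts[floor_bucket(ts)] += n

# Step 2: walk every interval in the range and fill gaps with 0 (the outer join + coalesce part)
start = datetime(2010, 11, 16, 10, 30)
end = datetime(2010, 11, 16, 11, 0)  # exclusive upper bound, an assumption of this sketch
result = []
current = start
while current < end:
    result.append((current, counts.get(current, 0)))
    current += STEP

for interval_start, total in result:
    print(interval_start, total)
```

For these rows the sketch yields the totals 2, 10, 0, 8, 0 and 11 for the intervals from 10:30:00 to 10:55:00, matching the output requested in the question.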

The interval can easily be changed by adjusting the last parameter of generate_series. In our case we use '5m', but it could be any interval we want, as long as the 300 in the bucketing expression is changed to the same length in seconds.

Nestor Martinez answered Sep 18 '22 17:09