
Aggregating data by timespan in MySQL

Basically, what I want is to aggregate some values in a table according to a timespan.

I take snapshots of a system every 15 minutes, and I want to be able to draw graphs over a long period. Since the graphs get really confusing if too many points are shown (besides getting really slow to render), I want to reduce the number of points by aggregating multiple points into a single point by averaging over them.

For this I'd have to be able to group by buckets that I can define myself (daily, weekly, monthly, yearly, ...), but so far none of my experiments have worked.

Is there some trick I can apply to do so?

asked Dec 10 '09 02:12 by cdecker



2 Answers

I had a similar question (collating-stats-into-time-chunks) and it was answered very well. In essence, the answer was:

Perhaps you can use the DATE_FORMAT() function, and grouping. Here's an example, hopefully you can adapt to your precise needs.

SELECT
    DATE_FORMAT( time, "%H:%i" ) AS bucket,   -- hour:minute label used as the bucket
    SUM( bytesIn )  AS totalBytesIn,
    SUM( bytesOut ) AS totalBytesOut
FROM
    stats
WHERE
    time BETWEEN <start> AND <end>
GROUP BY
    DATE_FORMAT( time, "%H:%i" )

If your time window covers more than one day and you use the example format, data from different days will be aggregated into the same time-of-day buckets. If the raw data doesn't fall exactly on the hour, you can smooth it out by using "%H:00" as the format, which groups every row by its hour.
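
To keep each calendar day separate instead of folding days together, the format string can include the date parts. A minimal sketch, reusing the stats table and columns from the query above and swapping SUM for AVG because the question asks for averages; the date range is a made-up placeholder, not something from the original answer:

```sql
-- One row per calendar day. The date literals are placeholder values.
SELECT
    DATE_FORMAT(time, '%Y-%m-%d') AS bucket,   -- daily bucket label
    AVG(bytesIn)  AS avgBytesIn,
    AVG(bytesOut) AS avgBytesOut
FROM
    stats
WHERE
    time >= '2009-12-01' AND time < '2010-01-01'
GROUP BY
    bucket
ORDER BY
    bucket;
```

Other bucket sizes should only need a different format string: '%x-%v' for year plus week number (weeks starting on Monday), '%Y-%m' for monthly, and '%Y' for yearly.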

Thanks to martin clayton for the answer he provided me.

answered Sep 24 '22 02:09 by cmroanirgo


It's easy to truncate times to the last 15 minutes (for example). In SQL Server syntax, that looks something like:

SELECT dateadd(minute, datediff(minute, '20000101', yourDateTimeField) / 15 * 15, '20000101') AS the15minuteBlock, COUNT(*) as Cnt
FROM yourTable
GROUP BY dateadd(minute, datediff(minute, '20000101', yourDateTimeField) / 15 * 15, '20000101');
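
For MySQL, the same truncation can be sketched with TIMESTAMPDIFF(), TIMESTAMPADD() and the integer-division operator DIV, since dateadd() and the three-argument datediff() are not available there. This is an adaptation under that assumption, not the answer's original code; yourTable and yourDateTimeField are the same placeholder names as above:

```sql
-- Count whole minutes since a fixed reference point, round that count down
-- to a multiple of 15, then add it back to the reference point.
SELECT
    TIMESTAMPADD(
        MINUTE,
        (TIMESTAMPDIFF(MINUTE, '2000-01-01 00:00:00', yourDateTimeField) DIV 15) * 15,
        '2000-01-01 00:00:00'
    ) AS the15minuteBlock,
    COUNT(*) AS Cnt
FROM
    yourTable
GROUP BY
    the15minuteBlock
ORDER BY
    the15minuteBlock;
```

DIV matters here: the plain / operator in MySQL returns a fractional result, so dividing and multiplying by 15 would not round anything down.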

Use similar truncation methods to group by hour, week, whatever.

You could always wrap it up in a CASE expression to support several bucket sizes, using:

GROUP BY CASE @option WHEN 'week' THEN dateadd(week, .....
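
In MySQL, one way to sketch the same idea is a CASE expression that picks a DATE_FORMAT() pattern per bucket size. The @bucket variable, the bucket names and the reuse of the stats table from the first answer are assumptions made for illustration:

```sql
-- @bucket is a hypothetical user variable set by the caller.
SET @bucket = 'week';

SELECT
    CASE @bucket
        WHEN 'day'   THEN DATE_FORMAT(time, '%Y-%m-%d')
        WHEN 'week'  THEN DATE_FORMAT(time, '%x-%v')   -- year and week number
        WHEN 'month' THEN DATE_FORMAT(time, '%Y-%m')
        ELSE              DATE_FORMAT(time, '%Y')      -- fall back to yearly
    END AS bucket,
    AVG(bytesIn)  AS avgBytesIn,
    AVG(bytesOut) AS avgBytesOut
FROM
    stats
GROUP BY
    bucket
ORDER BY
    bucket;
```

Changing the value assigned to @bucket switches the granularity without touching the rest of the query.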

answered Sep 26 '22 02:09 by Rob Farley