I want to save event time and the total amount of generated electric per 30 seconds. The total amount is not reseted to zero everytime. It's just the total from the meter first started to now, not total amount generated in the 30 seconds.
Is there any way of querying daily, weekly or monthly aggregations on the total amount of generated electric column (Maybe not just sum or avg)?
Or by design a AggregatingMergeTree
table?
I don't need to keep every record, just need the daily, weekly and monthly aggregations.
For example :
create table meter_record (
event_time Datetime,
generated_total Int64
)
UPDATE
Prefer to use SimpleAggregateFunction instead of AggregateFunction for simple functions like median, avg, min, max to speed up aggregates calculation.
Let's suggest you need to calculate median, average and dispersion aggregations for this table:
CREATE TABLE meter_record (
event_time Datetime,
generated_total Int64
)
ENGINE = MergeTree
PARTITION BY (toYYYYMM(event_time))
ORDER BY (event_time);
Use AggregatingMergeTree to calculate required aggregates:
CREATE MATERIALIZED VIEW meter_aggregates_mv
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(day)
ORDER BY (day)
AS
SELECT
toDate(toStartOfDay(event_time)) AS day,
/* aggregates to calculate the day's section left and right endpoints */
minState(generated_total) min_generated_total,
maxState(generated_total) max_generated_total,
/* specific aggregates */
medianState(generated_total) AS totalMedian,
avgState(generated_total) AS totalAvg,
varPopState(generated_total) AS totalDispersion
/* ... */
FROM meter_record
GROUP BY day;
To get required daily / weekly / montly (and any day-base aggregation like quarterly or yearly) aggregates use these queries:
/* daily report */
SELECT
day,
minMerge(min_generated_total) min_generated_total,
maxMerge(max_generated_total) max_generated_total,
medianMerge(totalMedian) AS totalMedian,
avgMerge(totalAvg) AS totalAvg,
varPopMerge(totalDispersion) AS totalDispersion
FROM meter_aggregates_mv
/*WHERE day >= '2019-02-05' and day < '2019-07-01'*/
GROUP BY day;
/* weekly report */
SELECT
toStartOfWeek(day, 1) monday,
minMerge(min_generated_total) min_generated_total,
maxMerge(max_generated_total) max_generated_total,
medianMerge(totalMedian) AS totalMedian,
avgMerge(totalAvg) AS totalAvg,
varPopMerge(totalDispersion) AS totalDispersion
FROM meter_aggregates_mv
/*WHERE day >= '2019-02-05' and day < '2019-07-01'*/
GROUP BY monday;
/* monthly report */
SELECT
toStartOfMonth(day) month,
minMerge(min_generated_total) min_generated_total,
maxMerge(max_generated_total) max_generated_total,
medianMerge(totalMedian) AS totalMedian,
avgMerge(totalAvg) AS totalAvg,
varPopMerge(totalDispersion) AS totalDispersion
FROM meter_aggregates_mv
/*WHERE day >= '2019-02-05' and day < '2019-07-01'*/
GROUP BY month;
/* get daily / weekly / monthly reports in one query (thanks @Denis Zhuravlev for advise) */
SELECT
day,
toStartOfWeek(day, 1) AS week,
toStartOfMonth(day) AS month,
minMerge(min_generated_total) min_generated_total,
maxMerge(max_generated_total) max_generated_total,
medianMerge(totalMedian) AS totalMedian,
avgMerge(totalAvg) AS totalAvg,
varPopMerge(totalDispersion) AS totalDispersion
FROM meter_aggregates_mv
/*WHERE (day >= '2019-05-01') AND (day < '2019-06-01')*/
GROUP BY month, week, day WITH ROLLUP
ORDER BY day, week, month;
Remarks:
you point that raw-data is not required to you just aggregates, so you can set engine for meter_record-table as Null, manually clean meter_record (see DROP PARTITION) or define the TTL to do it automatically
removing raw-data is bad practice because it makes impossible to calculate new aggregates on historical data or restore exist aggregates etc
the materialized view meter_aggregates_mv will contains only the data inserted in the table meter_record after creating the view. To change this behavior use POPULATE in view definition
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With