Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PostgreSQL getting daily, weekly, and monthly averages of the occurrences of an event in one query

Currently I have this rather large query that works by

  1. Aggregating the daily, weekly, monthly counts into intermediate tables by taking the count() of an event grouped by the event name and the date.
  2. Selecting the avg count over each intermediate table by doing avg() group by just event, doing a union of the results, and because I want to have a separate column for daily, weekly, monthly, putting a filler value of 0 into empty columns.
  3. I then sum over all the columns, and the 0s basically act as a no-op, which gives me just a single value for each event.

The query is pretty large though, and I feel like I'm doing a lot of repetitive work. Is there any way to do this query better or make it smaller? I haven't really done queries like this before so I'm not quite sure.

WITH monthly_counts as (
  SELECT
    event,
    count(*) as count
  FROM tracking_stuff
  WHERE
    event = 'thing'
    OR event = 'thing2'
    OR event = 'thing3'
  GROUP BY event, date_trunc('month', created_at)
),
weekly_counts as (
  SELECT
    event,
    count(*) as count
  FROM tracking_stuff
  WHERE
    event = 'thing'
    OR event = 'thing2'
    OR event = 'thing3'
  GROUP BY event, date_trunc('week', created_at)
),
daily_counts as (
  SELECT
    event,
    count(*) as count
  FROM tracking_stuff
  WHERE
    event = 'thing'
    OR event = 'thing2'
    OR event = 'thing3'
  GROUP BY event, date_trunc('day', created_at)
),
query as (
  SELECT
    event,
    0 as daily_avg,
    0 as weekly_avg,
    avg(count) as monthly_avg
  FROM monthly_counts
  GROUP BY event
  UNION
  SELECT
    event,
    0 as daily_avg,
    avg(count) as weekly_avg,
    0 as monthly_avg
  FROM weekly_counts
  GROUP BY event
  UNION
  SELECT
    event,
    avg(count) as daily_avg,
    0 as weekly_avg,
    0 as monthly_avg
  FROM daily_counts
  GROUP BY event
)
SELECT
  event,
  sum(daily_avg) as daily_avg,
  sum(weekly_avg) as weekly_avg,
  sum(monthly_avg) as monthly_avg
FROM query
GROUP BY event;
like image 639
m0meni Avatar asked Jul 06 '16 14:07

m0meni


1 Answers

I'd write the query in a way like this:

select event, daily_avg, weekly_avg, monthly_avg
from (
    select event, avg(count) monthly_avg
    from (
        select event, count(*)
        from tracking_stuff
        where event in ('thing1', 'thing2', 'thing3')
        group by event, date_trunc('month', created_at)
    ) s
    group by 1
) monthly
join (
    select event, avg(count) weekly_avg
    from (
        select event, count(*)
        from tracking_stuff
        where event in ('thing1', 'thing2', 'thing3')
        group by event, date_trunc('week', created_at)
    ) s
    group by 1
) weekly using(event)
join (
    select event, avg(count) daily_avg
    from (
        select event, count(*)
        from tracking_stuff
        where event in ('thing1', 'thing2', 'thing3')
        group by event, date_trunc('day', created_at)
    ) s
    group by 1
) daily using(event)
order by 1;

If the where condition eliminates a significant portion of the data (say more than a half) the use of cte could slightly speed up the query execution:

with the_data as (
    select event, created_at
    from tracking_stuff
    where event in ('thing1', 'thing2', 'thing3')
    )

select event, daily_avg, weekly_avg, monthly_avg
from (
    select event, avg(count) monthly_avg
    from (
        select event, count(*)
        from the_data
        group by event, date_trunc('month', created_at)
    ) s
    group by 1
) monthly
--  etc ... 

Just for curiosity I've done a test on the data:

create table tracking_stuff (event text, created_at timestamp);
insert into tracking_stuff
    select 'thing' || random_int(9), '2016-01-01'::date+ random_int(365)
    from generate_series(1, 1000000);

In every query I've replaced thing with thing1, so the queries eliminate about 2/3 of rows.

Average execution time of 10 tests:

Original query          1106 ms
My query without cte    1077 ms
My query with cte        902 ms
Clodoaldo's query       5187 ms
like image 154
klin Avatar answered Sep 22 '22 08:09

klin