Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Querying DAU/MAU over time (daily)

Tags:

sql

postgresql

I have a daily sessions table with columns user_id and date. I'd like to graph out DAU/MAU (daily active users / monthly active users) on a daily basis. For example:

Date         MAU      DAU     DAU/MAU
2014-06-01   20,000   5,000   20%
2014-06-02   21,000   4,000   19%
2014-06-03   20,050   3,050   17%
...          ...      ...     ...

Calculating daily actives is straightforward to calculate, but calculating the monthly actives e.g. the number of users that logged in the date-30 days, is causing problems. How is this achieved without a left join for each day?

Edit: I'm using Postgres.

like image 452
David Bailey Avatar asked Jun 30 '14 16:06

David Bailey


People also ask

How are daily active users calculated?

How is DAU calculated? DAU, or Daily Active Users, is calculated by counting the unique number of daily users who are tagged as "active" in your product. You can also calculate the average DAU by dividing the total DAU for the month by the number of days in the month.

What SQL commands are used to determine the daily sessions per day within a selected month?

The Outer Query The datediff() function returns the number of days between the first day of the month to the first day of the following month.


2 Answers

Assuming you have values for each day, you can get the total counts using a subquery and range between:

with dau as (
      select date, count(userid) as dau
      from dailysessions ds
      group by date
     )
select date, dau,
       sum(dau) over (order by date rows between -29 preceding and current row) as mau
from dau;

Unfortunately, I think you want distinct users rather than just user counts. That makes the problem much more difficult, especially because Postgres doesn't support count(distinct) as a window function.

I think you have to do some sort of self join for this. Here is one method:

with dau as (
      select date, count(distinct userid) as dau
      from dailysessions ds
      group by date
     )
select date, dau,
       (select count(distinct user_id)
        from dailysessions ds
        where ds.date between date - 29 * interval '1 day' and date
       ) as mau
from dau;
like image 81
Gordon Linoff Avatar answered Sep 18 '22 14:09

Gordon Linoff


This one uses COUNT DISTINCT to get the rolling 30 days DAU/MAU:

(calculating reddit's user engagement in BigQuery - but the SQL is standard enough to be used on other databases)

SELECT day, dau, mau, INTEGER(100*dau/mau) daumau
FROM (
  SELECT day, EXACT_COUNT_DISTINCT(author) dau, FIRST(mau) mau
  FROM (
    SELECT DATE(SEC_TO_TIMESTAMP(created_utc)) day, author
    FROM [fh-bigquery:reddit_comments.2015_09]
    WHERE subreddit='AskReddit') a
  JOIN (
    SELECT stopday, EXACT_COUNT_DISTINCT(author) mau
    FROM (SELECT created_utc, subreddit, author FROM [fh-bigquery:reddit_comments.2015_09], [fh-bigquery:reddit_comments.2015_08]) a
    CROSS JOIN (
      SELECT DATE(SEC_TO_TIMESTAMP(created_utc)) stopday
      FROM [fh-bigquery:reddit_comments.2015_09]
      GROUP BY 1
    ) b
    WHERE subreddit='AskReddit'
    AND SEC_TO_TIMESTAMP(created_utc) BETWEEN DATE_ADD(stopday, -30, 'day') AND TIMESTAMP(stopday)
    GROUP BY 1
  ) b
  ON a.day=b.stopday
  GROUP BY 1
)
ORDER BY 1

I went further at How to calculate DAU/MAU with BigQuery (engagement)

like image 32
Felipe Hoffa Avatar answered Sep 20 '22 14:09

Felipe Hoffa