I have a simple table, as below, with lots of IDs and dates.
ID Date
10R46 2014-11-23
10R46 2016-04-11
100R9 2016-12-21
10R91 2013-05-03
... ...
I want to write a query that counts the unique IDs over a rolling window of dates, for example ten days. That is, for each date it should give me the number of unique IDs between that date and 10 days back. The result should look something like this:
UniqueTenDays Date
200 2014-11-23
324 2014-11-24
522 2014-11-25
532 2014-11-26
... ...
I have something along the lines of the query below, but I realise I need to apply the WHERE clause and count the IDs separately for each Date somehow.
SELECT Date, COUNT(DISTINCT ID)
FROM T
WHERE Date BETWEEN DATE_SUB(Date, INTERVAL 10 DAY) AND Date
GROUP BY Date
ORDER BY Date
Thanks in advance.
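Below is for BigQuery Standard SQL. It joins each date's distinct IDs into a comma-separated string, concatenates those strings over a window reaching 10 days back from each date, then splits the result and counts the distinct IDs. This relies on the IDs not containing commas, since STRING_AGG and SPLIT both default to a comma delimiter.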
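#standardSQL
WITH temp1 AS (
  -- One row per date, with that date's distinct IDs joined into a comma-separated string
  SELECT dt, STRING_AGG(DISTINCT id) AS users
  FROM `project.dataset.yourtable`
  GROUP BY dt
), temp2 AS (
  -- Concatenate the ID strings of every date from 10 days back through the current date
  SELECT
    dt,
    STRING_AGG(users) OVER(ORDER BY UNIX_DATE(dt) RANGE BETWEEN 10 PRECEDING AND CURRENT ROW) AS users
  FROM temp1
)
-- Split the combined string back into IDs and count the distinct ones
SELECT dt,
  (SELECT COUNT(DISTINCT id) FROM UNNEST(SPLIT(users)) AS id) AS UniqueTenDays
FROM temp2
ORDER BY dt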
You can test and play with it using dummy data, as below:
#standardSQL
WITH `project.dataset.yourtable` AS (
  -- Dummy rows standing in for the real table
  SELECT '10R46' id, DATE '2014-11-23' dt UNION ALL
  SELECT '10R46', DATE '2016-04-11' UNION ALL
  SELECT '10R46', DATE '2016-04-12' UNION ALL
  SELECT '10R47', DATE '2016-04-13' UNION ALL
  SELECT '10R48', DATE '2016-04-14' UNION ALL
  SELECT '100R9', DATE '2016-12-21' UNION ALL
  SELECT '10R91', DATE '2013-05-03'
), temp1 AS (
  -- One row per date, with that date's distinct IDs joined into a comma-separated string
  SELECT dt, STRING_AGG(DISTINCT id) AS users
  FROM `project.dataset.yourtable`
  GROUP BY dt
), temp2 AS (
  -- Concatenate the ID strings of every date from 10 days back through the current date
  SELECT
    dt,
    STRING_AGG(users) OVER(ORDER BY UNIX_DATE(dt) RANGE BETWEEN 10 PRECEDING AND CURRENT ROW) AS users
  FROM temp1
)
-- Split the combined string back into IDs and count the distinct ones
SELECT dt,
  (SELECT COUNT(DISTINCT id) FROM UNNEST(SPLIT(users)) AS id) AS UniqueTenDays
FROM temp2
ORDER BY dt
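If you want to sanity-check the numbers outside BigQuery, here is a minimal Python sketch of the same rolling count over the same dummy rows. It is not part of the query, just a brute-force cross-check; the helper name `rolling_unique` and its `days` parameter are made up for this illustration. It assumes the window is inclusive on both ends (the date itself plus the 10 days before it), which is what RANGE BETWEEN 10 PRECEDING AND CURRENT ROW over UNIX_DATE(dt) covers.

```python
from datetime import date, timedelta

# Same dummy rows as in the test query above: (id, dt).
rows = [
    ("10R46", date(2014, 11, 23)),
    ("10R46", date(2016, 4, 11)),
    ("10R46", date(2016, 4, 12)),
    ("10R47", date(2016, 4, 13)),
    ("10R48", date(2016, 4, 14)),
    ("100R9", date(2016, 12, 21)),
    ("10R91", date(2013, 5, 3)),
]


def rolling_unique(rows, days=10):
    """For each date present in rows, count the distinct ids seen from
    `days` days earlier through that date, both ends inclusive."""
    result = {}
    for dt in sorted({d for _, d in rows}):
        start = dt - timedelta(days=days)
        # Brute force: rescan all rows for every date (fine for a small check).
        result[dt] = len({i for i, d in rows if start <= d <= dt})
    return result


for dt, n in rolling_unique(rows).items():
    print(dt, n)
```

For the dummy rows this should give 1 for every date except 2016-04-13 (2) and 2016-04-14 (3), because only the April 2016 dates fall within 10 days of each other. Like the query, it only produces a row for dates that actually appear in the table.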