Given a table in Google BigQuery:
User Timestamp
A TIMESTAMP(12/05/2015 12:05:01.8023)
B TIMESTAMP(9/29/2015 12:15:01.0323)
B TIMESTAMP(9/29/2015 13:05:01.0233)
A TIMESTAMP(9/29/2015 14:05:01.0432)
C TIMESTAMP(8/15/2015 5:05:01.0000)
B TIMESTAMP(9/29/2015 14:06:01.0233)
A TIMESTAMP(9/29/2015 14:06:01.0432)
Is there a simple way to compute:
User Maximum_Number_of_Events_this_User_Had_in_One_Hour
A 2
B 3
C 1
where the time window of one hour is a parameter?
I tried doing this myself using a combination of LAG and partition functions by building off of these two questions:
BigQuery SQL for 28-day sliding window aggregate (without writing 28 lines of SQL)
Bigquery SQL for sliding window aggregate
But find those posts are too dissimilar since I am not finding number of people per time window, but instead finding max number of events per person within a time window.
Here is an efficient succinct way to do it that exploits the ordered structure of timestamps.
SELECT
user,
MAX(per_hour) AS max_event_per_hour
FROM
(
SELECT
user,
COUNT(*) OVER (PARTITION BY user ORDER BY timestamp RANGE BETWEEN 60 * 60 * 1000000 PRECEDING AND CURRENT ROW) as per_hour,
timestamp
FROM
[dataset_example_in_question_user_timestamps]
)
GROUP BY user
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With