I have a table in BigQuery that looks like this:
Caller_Number | month | day| call_time
--------------|--------|-----|----------
1 | 5 | 15 | 12:56:17
I want to write a SQL query for BigQuery that will allow me to count the consecutive hours in which at least one call was made (sorted by caller_number), and the consecutive days in which at least 10 consecutive hours with calls occurred (sorter by caller_number). I have been looking at the existing resources on gaps and islands, but can't seem to figure out how to apply it to consecutive dates and hours.
Below is working example for consecutive hours
Steps are
1.“Extract” hour from call_time
HOUR(TIMESTAMP(CURRENT_DATE() + ' ' + call_time))
2.Find previous hour
LAG([hour]) OVER(PARTITION BY Caller_Number, [month], [day] ORDER BY [hour])
3.Calculate start of group of consecutive hours – 1 – start, 0 – group continuation
IFNULL(INTEGER([hour] - prev_hour > 1), 1)
4.Assign group number to each group
SUM(seq) OVER(PARTITION BY Caller_Number, [month], [day] ORDER BY [hour])
5.And finally – group by group number and count calls and hours
Hope this gives you good start for implementing similar logic for consecutive days on top of consec hours result
SELECT Caller_Number, [month], [day], seq_group,
EXACT_COUNT_DISTINCT([hour]) AS hours_count, COUNT(1) AS calls_count
FROM (
SELECT Caller_Number, [month], [day], [hour],
SUM(seq) OVER(PARTITION BY Caller_Number, [month], [day]
ORDER BY [hour]) AS seq_group
FROM (
SELECT Caller_Number, [month], [day], [hour],
IFNULL(INTEGER([hour] - prev_hour > 1), 1) AS seq
FROM (
SELECT Caller_Number, [month], [day], [hour],
LAG([hour]) OVER(PARTITION BY Caller_Number, [month], [day]
ORDER BY [hour]) AS prev_hour
FROM (
SELECT Caller_Number, [month], [day],
HOUR(TIMESTAMP(CURRENT_DATE() + ' ' + call_time)) AS [hour]
FROM YourTable
)
)
)
)
GROUP BY Caller_Number, [month], [day], seq_group
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With