I'm trying to compute a 28 day moving sum in BigQuery using the LAG function.
The top answer to this question
Bigquery SQL for sliding window aggregate
from Felipe Hoffa indicates that that you can use the LAG function. An example of this would be:
SELECT
spend + spend_lagged_1day + spend_lagged_2day + spend_lagged_3day + ... + spend_lagged_27day as spend_28_day_sum,
user,
date
FROM (
SELECT spend,
LAG(spend, 1) OVER (PARTITION BY user ORDER BY date) spend_lagged_1day,
LAG(spend, 2) OVER (PARTITION BY user ORDER BY date) spend_lagged_2day,
LAG(spend, 3) OVER (PARTITION BY user ORDER BY date) spend_lagged_3day,
...
LAG(spend, 28) OVER (PARTITION BY user ORDER BY date) spend_lagged_day,
user,
date
FROM user_spend
)
Is there a way to do this without having to write out 28 lines of SQL!
Bigquery: How to get a rolling time range in a window clause.....
This is an old post, but I spend a long time searching for a solution, and this post came up so maybe this will help someone.
IF your partition of your window clause does not have a record for every day, you need to use the RANGE clause to accurately get a rolling time range, (ROWS would search the number records, which would to go too far back since you don't have a record for every day in your PARTITION BY). The problem is that in Bigquery RANGE clause does not support Dates.
From BigQuery's documentation:
numeric_expression must have numeric type. DATE and TIMESTAMP are not currently supported. In addition, the numeric_expression must be a constant, non-negative integer or a parameter.
The workaround I found was to use the UNIX_DATE(date_expression) in the ORDER BY clause along with a RANGE clause:
SUM(value) OVER (PARTITION BY Column1 ORDER BY UNIX_DATE(Date) RANGE BETWEEN 5 PRECEDING AND CURRENT ROW
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With