Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a way to do rolling averages in Big Query?

I know there is an AVG function in Big Query and there are window functions to shift previous or next values up or down a row, but is there any function that allows you to average over a specified interval? For instance I'd like to so something like the following:

SELECT
    city
    AVG(temperature) OVER(PARTITION BY city, INTERVAL day,14, ORDER BY day) as rolling_avg_14_days,
    AVG(temperature) OVER(PARTITION BY city, INTERVAL day,30, ORDER BY day) as rolling_avg_30_days,
WHERE
    city IN ("Los Angeles","Chicago","Sun Prairie","Sunnyvale")
    AND year BETWEEN 1900 AND 2013

I'd like to do a rolling average calculation that allows me to specify a range of values to do an aggregation function over, and what value to order by. The average function would take the current day temp and previous 13 days (or previous 29) to calculate and average. Is this possible today? I know I could do something like this if I put 13 LAG/OVER fields in the SELECT statement and then avg the results of all of them, but that is a lot of overhead.

like image 451
Steve Scherer Avatar asked Mar 06 '14 23:03

Steve Scherer


People also ask

How do you calculate rolling average in SQL?

The second part of the query (in black text) is the calculation of the rolling average. Similarly to the first example, we use the AVG() window function and the clause OVER(ORDER BY day ROWS BETWEEN 9 PRECEDING AND CURRENT ROW) . This applies the AVG() function to the current row and the nine rows before it.

How do you calculate rolling average?

A rolling average continuously updates the average of a data set to include all the data in the set until that point. For example, the rolling average of return quantities at March 2012 would be calculated by adding the return quantities in January, February, and March, and then dividing that sum by three.

How do you plot a 7 day rolling average?

A moving average means that it takes the past days of numbers, takes the average of those days, and plots it on the graph. For a 7-day moving average, it takes the last 7 days, adds them up, and divides it by 7. For a 14-day average, it will take the past 14 days.

How do you calculate 3 day moving average in SQL?

To do so, we calculate the average of the stock prices from three consecutive days—the day in question and the two previous days—then repeat the same for each day in the data set. This is a three-day moving average, because we average over a period of three days.


1 Answers

I think OVER with RANGE construction of Window Functions is best fit here

Assuming that day field is represented as 'YYYY-MM-DD' format, below query does the rolling averages

SELECT
  city,
  day,
  AVG(temperature) OVER(PARTITION BY city ORDER BY ts 
                RANGE BETWEEN 14*24*3600 PRECEDING AND CURRENT ROW) AS rolling_avg_14_days,
  AVG(temperature) OVER(PARTITION BY city ORDER BY ts 
                RANGE BETWEEN 30*24*3600 PRECEDING AND CURRENT ROW) AS rolling_avg_30_days
FROM (
  SELECT day, city, temperature, TIMESTAMP_TO_SEC(TIMESTAMP(day)) AS ts 
  FROM temperatures
)

You most likely already found this solution long ago, but still wanted to have what i think a better answer (as of today) here for this question

like image 85
Mikhail Berlyant Avatar answered Oct 14 '22 15:10

Mikhail Berlyant