Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Calculate moving average using SQL window functions with leading null's where not enough data is avaliable

I want to calculate a moving average using SQL window functions. The following example of a 2 "day" moving average basically works fine, but It also calculates an average if only one data point is available. I rather want the average to be null as long as not enough data is available

create table average(
    nr int,
    value float
);

insert into average values (1, 2), (2, 4), (3, 6), (3, 8), (4, 10);

SELECT
    nr, 
    value, 
    AVG(value) OVER (ORDER BY nr ROWS BETWEEN 1 PRECEDING AND 0 FOLLOWING)::FLOAT AS "Moving-Average-2"
FROM average;

result:

1   2   2
2   4   3
3   6   5
3   8   7
4   10  9

expected result:

1   2   null
2   4   3
3   6   5
3   8   7
4   10  9

EDIT 1: Of course the average can be anything not only 2.

like image 639
KIC Avatar asked Apr 08 '19 09:04

KIC


People also ask

Can we use Windows function in where clause?

You can't use window functions in WHERE , because the logical order of operations in an SQL query is completely different from the SQL syntax. The logical order of operations in SQL is: FROM, JOIN. WHERE.

How do I find the average of each row in SQL?

For example, 2+4+4+6+6+8 is 30 divided 6 which results in an average of 5. This is the basic syntax for the AVG function: SELECT AVG(column_name) FROM table_name; In this example, we have a table called students , with columns of id , name , date , and scores .

What is a windowed function SQL?

A window function performs a calculation across a set of table rows that are somehow related to the current row. This is comparable to the type of calculation that can be done with an aggregate function.


3 Answers

You could use another window function (COUNT()) to make sure that at least two records are available in the window before doing the computation, like:

SELECT
    nr, 
    value, 
    CASE WHEN COUNT(*) OVER(ORDER BY nr ROWS BETWEEN 1 PRECEDING AND 0 FOLLOWING) > 1
        THEN AVG(value) OVER (ORDER BY nr ROWS BETWEEN 1 PRECEDING AND 0 FOLLOWING)::FLOAT 
        ELSE NULL
    END AS "Moving-Average-2"
FROM average;

Demo on DB Fiddle:

| nr  | value | Moving-Average-2 |
| --- | ----- | ---------------- |
| 1   | 2     |                  |
| 2   | 4     | 3                |
| 3   | 6     | 5                |
| 3   | 8     | 7                |
| 4   | 10    | 9                |
like image 139
GMB Avatar answered Oct 19 '22 15:10

GMB


Since you happen to form the average only between 1 preceding row and the current one, just using lag() might be simplest:

select nr, value
     ,(value + lag(value, 1, NULL) OVER (ORDER BY nr)) / 2 AS "Moving-Average-2"
from average;

lag() has an overloaded variant that allows to provide a default value (as 3rd parameter) in case there is no row. Provide NULL and you are there. Or, since NULL is the default default anyway, just:

 ... ,(value + lag(value) OVER (ORDER BY nr)) / 2 AS "Moving-Average-2"

While the underlying table column is of type float, you need no cast to float in this case.

This is assuming the column value is defined NOT NULL (like indicated by your sample data). Else you also get NULL where the previous row has value IS NULL and the current row has a value, while avg() returns the value in this case! (Or this may be what you want anyway, given your question.)

like image 21
Erwin Brandstetter Avatar answered Oct 19 '22 16:10

Erwin Brandstetter


This may be a handy place to use a window specification:

select a.*,
       (case when row_number() over w > 1
             then avg(value) over w
        end) as running_average
from average a
window w as (order by nr rows between 1 preceding and current row);
like image 2
Gordon Linoff Avatar answered Oct 19 '22 15:10

Gordon Linoff