I want to calculate a moving average using SQL window functions. The following example of a 2 "day" moving average basically works fine, but It also calculates an average if only one data point is available. I rather want the average to be null as long as not enough data is available
create table average(
nr int,
value float
);
insert into average values (1, 2), (2, 4), (3, 6), (3, 8), (4, 10);
SELECT
nr,
value,
AVG(value) OVER (ORDER BY nr ROWS BETWEEN 1 PRECEDING AND 0 FOLLOWING)::FLOAT AS "Moving-Average-2"
FROM average;
result:
1 2 2
2 4 3
3 6 5
3 8 7
4 10 9
expected result:
1 2 null
2 4 3
3 6 5
3 8 7
4 10 9
EDIT 1: Of course the average can be anything not only 2.
You can't use window functions in WHERE , because the logical order of operations in an SQL query is completely different from the SQL syntax. The logical order of operations in SQL is: FROM, JOIN. WHERE.
For example, 2+4+4+6+6+8 is 30 divided 6 which results in an average of 5. This is the basic syntax for the AVG function: SELECT AVG(column_name) FROM table_name; In this example, we have a table called students , with columns of id , name , date , and scores .
A window function performs a calculation across a set of table rows that are somehow related to the current row. This is comparable to the type of calculation that can be done with an aggregate function.
You could use another window function (COUNT()
) to make sure that at least two records are available in the window before doing the computation, like:
SELECT
nr,
value,
CASE WHEN COUNT(*) OVER(ORDER BY nr ROWS BETWEEN 1 PRECEDING AND 0 FOLLOWING) > 1
THEN AVG(value) OVER (ORDER BY nr ROWS BETWEEN 1 PRECEDING AND 0 FOLLOWING)::FLOAT
ELSE NULL
END AS "Moving-Average-2"
FROM average;
Demo on DB Fiddle:
| nr | value | Moving-Average-2 |
| --- | ----- | ---------------- |
| 1 | 2 | |
| 2 | 4 | 3 |
| 3 | 6 | 5 |
| 3 | 8 | 7 |
| 4 | 10 | 9 |
Since you happen to form the average only between 1 preceding row and the current one, just using lag()
might be simplest:
select nr, value
,(value + lag(value, 1, NULL) OVER (ORDER BY nr)) / 2 AS "Moving-Average-2"
from average;
lag()
has an overloaded variant that allows to provide a default value (as 3rd parameter) in case there is no row. Provide NULL
and you are there. Or, since NULL
is the default default anyway, just:
... ,(value + lag(value) OVER (ORDER BY nr)) / 2 AS "Moving-Average-2"
While the underlying table column is of type float
, you need no cast to float
in this case.
This is assuming the column value is defined NOT NULL
(like indicated by your sample data). Else you also get NULL
where the previous row has value IS NULL
and the current row has a value, while avg()
returns the value in this case! (Or this may be what you want anyway, given your question.)
This may be a handy place to use a window specification:
select a.*,
(case when row_number() over w > 1
then avg(value) over w
end) as running_average
from average a
window w as (order by nr rows between 1 preceding and current row);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With