This is my postgres table
name | revenue
--------+---------
John | 100
Will | 100
Tom | 100
Susan | 100
Ben |
(5 rows)
Here, when I calculate average for revenue, It returns 100, which is clearly not the case and sum/count, which is 400/5 is 80. Is this behaviour by conventional design or am I missing the point?
I know I could change null to 0 and process as usual . But, given the default behaviour, is this intentional and preferred way of calculating average.
The AVG function works with numeric values and ignores NULL values.
AVG() function does not consider the NULL values during its calculation.
So you can use AVERAGE and divide by the number of columns >=0 (to not take into account the null columns), for that you can use COUNTIF (depedns of the range) or if you can't do it and if there are not much columns you can modified the /4 for something like sum(if(column1>=0,1,0),if(column2>=0,1,0).....)
PostgreSQL provides an AVG() function to calculate the average value of a set. The AVG() function is one of the most frequently used aggregate functions in PostgreSQL. The AVG() function enables users to calculate the average value of a numeric column. It can be used with both the SELECT and HAVING clause.
PostgreSQL In PostgreSQL, NULL means no value. In other words, the NULL column does not have any value. It does not equal 0, empty string, or spaces.
To calculate moving average in PostgreSQL, we first sort the rows chronologically using ORDER BY clause. Then we define our Window for calculating average, using ROWS BETWEEN 4 PRECEDING AND CURRENT ROW. It means that for each row calculate average for only the current row and preceding 4 rows.
The AVG () function is one of the most commonly used aggregate functions in PostgreSQL. The AVG () function allows you to calculate the average value of a set. You can use the AVG () function in the SELECT and HAVING clauses. Let’s take a look at some examples of using the AVG function.
All we can know right now is that the average of known values is 400 / 4 = 100. If you actually know that you have 0 revenue for this item, you should store 0 in that column. If you don't know what revenue you have for that item, you should exclude it from your calculations, which is what Postgres, following the SQL Standard, does for you.
This is both intentional and perfectly logical. Remember that NULL
means that the value is unknown.
It might, for instance, represent a value which will be filled in at some future date. If the future value turns out to be 0
, the average will be 400 / 5 = 80
, as you say; but if the future value turns out to be 200
, the average value will be 600 / 5 = 120
instead. All we can know right now is that the average of known values is 400 / 4 = 100
.
If you actually know that you have 0
revenue for this item, you should store 0
in that column. If you don't know what revenue you have for that item, you should exclude it from your calculations, which is what Postgres, following the SQL Standard, does for you.
If you can't fix the data, but it is in fact a case that all NULL
s in this table should be treated as 0
- or as some other fixed value - you can use a COALESCE
inside the aggregate:
SELECT AVG(COALESCE(revenue, 0)) as forced_average
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With