Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Postgres Average calculation ignores null

Tags:

postgresql

This is my postgres table

 name   | revenue   
--------+---------  
 John   |     100  
 Will   |     100  
 Tom    |     100   
 Susan  |     100   
 Ben    |        
(5 rows)

Here, when I calculate average for revenue, It returns 100, which is clearly not the case and sum/count, which is 400/5 is 80. Is this behaviour by conventional design or am I missing the point?

I know I could change null to 0 and process as usual . But, given the default behaviour, is this intentional and preferred way of calculating average.

like image 872
Siva Black Jester Avatar asked Mar 29 '18 13:03

Siva Black Jester


People also ask

Does Avg ignore NULL values Postgres?

The AVG function works with numeric values and ignores NULL values.

Does Avg ignore NULL values?

AVG() function does not consider the NULL values during its calculation.

How do you find the average with NULL value?

So you can use AVERAGE and divide by the number of columns >=0 (to not take into account the null columns), for that you can use COUNTIF (depedns of the range) or if you can't do it and if there are not much columns you can modified the /4 for something like sum(if(column1>=0,1,0),if(column2>=0,1,0).....)

How does Postgres calculate average?

PostgreSQL provides an AVG() function to calculate the average value of a set. The AVG() function is one of the most frequently used aggregate functions in PostgreSQL. The AVG() function enables users to calculate the average value of a numeric column. It can be used with both the SELECT and HAVING clause.

What does null mean in PostgreSQL?

PostgreSQL In PostgreSQL, NULL means no value. In other words, the NULL column does not have any value. It does not equal 0, empty string, or spaces.

How to calculate moving average in PostgreSQL?

To calculate moving average in PostgreSQL, we first sort the rows chronologically using ORDER BY clause. Then we define our Window for calculating average, using ROWS BETWEEN 4 PRECEDING AND CURRENT ROW. It means that for each row calculate average for only the current row and preceding 4 rows.

What is the use of Avg() function in PostgreSQL?

The AVG () function is one of the most commonly used aggregate functions in PostgreSQL. The AVG () function allows you to calculate the average value of a set. You can use the AVG () function in the SELECT and HAVING clauses. Let’s take a look at some examples of using the AVG function.

What is the average of known values in PostgreSQL?

All we can know right now is that the average of known values is 400 / 4 = 100. If you actually know that you have 0 revenue for this item, you should store 0 in that column. If you don't know what revenue you have for that item, you should exclude it from your calculations, which is what Postgres, following the SQL Standard, does for you.


1 Answers

This is both intentional and perfectly logical. Remember that NULL means that the value is unknown.

It might, for instance, represent a value which will be filled in at some future date. If the future value turns out to be 0, the average will be 400 / 5 = 80, as you say; but if the future value turns out to be 200, the average value will be 600 / 5 = 120 instead. All we can know right now is that the average of known values is 400 / 4 = 100.

If you actually know that you have 0 revenue for this item, you should store 0 in that column. If you don't know what revenue you have for that item, you should exclude it from your calculations, which is what Postgres, following the SQL Standard, does for you.

If you can't fix the data, but it is in fact a case that all NULLs in this table should be treated as 0 - or as some other fixed value - you can use a COALESCE inside the aggregate:

 SELECT AVG(COALESCE(revenue, 0)) as forced_average
like image 171
IMSoP Avatar answered Dec 06 '22 03:12

IMSoP