Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

NULL value count in group by

for simplification purposes, I will use simple table attribute (meaning the table is bigger) to demonstrate the issue:

I have the following table test:

 id | nbr
----+-----
  1 |   0
  2 |
  3 |
  4 |   1
  5 |   1
 (5 rows)

id and nbr are both numeric values

The following query

select nbr, count(nbr) from test group by nbr;

outputs:

 nbr | count
-----+-------
     |     0
   1 |     2
   0 |     1
(3 rows)

whereas the query:

select nbr, count(*) from test group by nbr;

outputs:

 nbr | count
-----+------
     |     2
   1 |     2
   0 |     1
 (3 rows)

I find it hard to explain the difference between count(nbr) and count(*) regarding null values can someone explain this to me like I'm five, thanks

like image 236
rachid el kedmiri Avatar asked Sep 29 '17 16:09

rachid el kedmiri


People also ask

Does GROUP BY COUNT NULLs?

GROUP BY does treat all NULL values equally.

How do you handle NULL values in GROUP BY?

If the grouping column contains a null value, that row becomes its own group in the results. If the grouping column contains more than one null value, all null values form a single group.

Does GROUP BY ignore NULL values?

Group functions ignore the NULL values in the column. To enforce the group functions ti include the NULL value, use NVL function.

Are NULLs counted in COUNT?

COUNT does not include NULL values in column counts. Therefore, the number of return values for each column might differ or be less than the total number of rows returned by COUNT(*).


3 Answers

It's pretty simple:

count(<expression>) counts the number of values. Like most aggregate functions, it removes null values before doing the actual aggregation.

count(*) is a special case that counts the number of rows (regardless of any null).

count (no matter if * or <expression>) never returns null (unlike most other aggregate functions). In case no rows are aggregated, the result is 0.

Now, you have done a group by on an nullable column. group by put's null values into the same group. That means, the group for nbr null has two rows. If you now apply count(nbr), the null values are removed before aggregation, giving you 0 as result.

If you would do count(id), there would be no null value to be removed, giving you 2.

This is standard SQL behavior and honored by pretty much every database.

One of the common use-cases is to emulate the filter clause in databases that don't support it natively: http://modern-sql.com/feature/filter#conforming-alternatives

The exceptions (aggregate functions that don't remove null prior to aggregation) are functions like json_arrayagg, json_objectagg, array_agg and the like.

like image 83
Markus Winand Avatar answered Oct 16 '22 09:10

Markus Winand


MySQL explains it in the documentation of function COUNT():

COUNT(expr)

Returns a count of the number of non-NULL values of expr in the rows retrieved by a SELECT statement.

COUNT(*) is somewhat different in that it returns a count of the number of rows retrieved, whether or not they contain NULL values.


PostgreSQL also explains it in the documentation:

Most aggregate functions ignore null inputs, so that rows in which one or more of the expression(s) yield null are discarded. This can be assumed to be true, unless otherwise specified, for all built-in aggregates.

For example, count(*) yields the total number of input rows; count(f1) yields the number of input rows in which f1 is non-null, since count ignores nulls; and count(distinct f1) yields the number of distinct non-null values of f1.

like image 42
axiac Avatar answered Oct 16 '22 09:10

axiac


count(*) count the number of rows related to the group by colums. Inpependntly of the fatc the the column in group by contain null or not null values

count(nbr) count the number of rows related to the group by column where nbr is not null

like image 4
ScaisEdge Avatar answered Oct 16 '22 09:10

ScaisEdge