for simplification purposes, I will use simple table attribute (meaning the table is bigger) to demonstrate the issue:
I have the following table test:
id | nbr
----+-----
1 | 0
2 |
3 |
4 | 1
5 | 1
(5 rows)
id and nbr are both numeric values
The following query
select nbr, count(nbr) from test group by nbr;
outputs:
nbr | count
-----+-------
| 0
1 | 2
0 | 1
(3 rows)
whereas the query:
select nbr, count(*) from test group by nbr;
outputs:
nbr | count
-----+------
| 2
1 | 2
0 | 1
(3 rows)
I find it hard to explain the difference between count(nbr) and count(*) regarding null values can someone explain this to me like I'm five, thanks
GROUP BY does treat all NULL values equally.
If the grouping column contains a null value, that row becomes its own group in the results. If the grouping column contains more than one null value, all null values form a single group.
Group functions ignore the NULL values in the column. To enforce the group functions ti include the NULL value, use NVL function.
COUNT does not include NULL values in column counts. Therefore, the number of return values for each column might differ or be less than the total number of rows returned by COUNT(*).
It's pretty simple:
count(<expression>)
counts the number of values. Like most aggregate functions, it removes null
values before doing the actual aggregation.
count(*)
is a special case that counts the number of rows (regardless of any null
).
count
(no matter if *
or <expression>
) never returns null
(unlike most other aggregate functions). In case no rows are aggregated, the result is 0
.
Now, you have done a group by
on an nullable column. group by
put's null
values into the same group. That means, the group for nbr
null
has two rows. If you now apply count(nbr)
, the null values are removed before aggregation, giving you 0
as result.
If you would do count(id)
, there would be no null
value to be removed, giving you 2
.
This is standard SQL behavior and honored by pretty much every database.
One of the common use-cases is to emulate the filter
clause in databases that don't support it natively: http://modern-sql.com/feature/filter#conforming-alternatives
The exceptions (aggregate functions that don't remove null
prior to aggregation) are functions like json_arrayagg
, json_objectagg
, array_agg
and the like.
MySQL explains it in the documentation of function COUNT()
:
COUNT(expr)
Returns a count of the number of non-
NULL
values of expr in the rows retrieved by aSELECT
statement.
COUNT(*)
is somewhat different in that it returns a count of the number of rows retrieved, whether or not they containNULL
values.
PostgreSQL also explains it in the documentation:
Most aggregate functions ignore
null
inputs, so that rows in which one or more of the expression(s) yield null are discarded. This can be assumed to be true, unless otherwise specified, for all built-in aggregates.For example,
count(*)
yields the total number of input rows;count(f1)
yields the number of input rows in whichf1
is non-null, sincecount
ignoresnull
s; andcount(distinct f1)
yields the number of distinct non-null
values off1
.
count(*)
count the number of rows related to the group by colums. Inpependntly of the fatc the the column in group by contain null or not null values
count(nbr)
count the number of rows related to the group by column where nbr is not null
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With