NULL value count in group by

Tags:

sql

mysql

postgresql

To keep things simple, I will use a reduced version of my table (the real one is bigger) to demonstrate the issue:

I have the following table test:

 id | nbr
----+-----
  1 |   0
  2 |
  3 |
  4 |   1
  5 |   1
(5 rows)

Both id and nbr are numeric columns.

The following query

select nbr, count(nbr) from test group by nbr;

outputs:

 nbr | count
-----+-------
     |     0
   1 |     2
   0 |     1
(3 rows)

whereas the query:

select nbr, count(*) from test group by nbr;

outputs:

 nbr | count
-----+-------
     |     2
   1 |     2
   0 |     1
(3 rows)

I find it hard to explain the difference between count(nbr) and count(*) when null values are involved. Can someone explain this to me like I'm five? Thanks.


asked Sep 29 '17 16:09

rachid el kedmiri

3 Answers

It's pretty simple:

count(<expression>) counts the number of non-null values. Like most aggregate functions, it removes null values before doing the actual aggregation.

count(*) is a special case that counts the number of rows (regardless of any null).

count (no matter if * or <expression>) never returns null (unlike most other aggregate functions). In case no rows are aggregated, the result is 0.
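
A minimal sketch of the empty-input case, assuming the question's table is recreated with integer columns (the setup statements and the alias names cnt and total are not from the original post):

```sql
-- Assumed setup: the question's table, recreated with integer columns.
drop table if exists test;
create table test (id int, nbr int);
insert into test (id, nbr) values (1, 0), (2, null), (3, null), (4, 1), (5, 1);

-- No row satisfies the predicate, so both aggregates see an empty input.
select count(nbr) as cnt,    -- 0
       sum(nbr)   as total   -- null
from test
where id > 100;
```

Without a group by, the query still returns exactly one row: count gives 0, whereas sum has nothing to add up and gives null.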

Now, you have done a group by on a nullable column. group by puts all null values into the same group. That means the group where nbr is null has two rows. If you now apply count(nbr), the null values are removed before aggregation, leaving nothing to count and giving you 0 as the result.

If you used count(id) instead, there would be no null values to remove, giving you 2.
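
To see the three variants side by side, here is a small sketch against the question's data. The column types and the alias names are assumptions made for illustration:

```sql
-- Assumed setup: the question's table, recreated with integer columns.
drop table if exists test;
create table test (id int, nbr int);
insert into test (id, nbr) values (1, 0), (2, null), (3, null), (4, 1), (5, 1);

select nbr,
       count(*)   as all_rows,    -- rows in the group
       count(nbr) as nbr_values,  -- non-null nbr values in the group
       count(id)  as id_values    -- non-null id values in the group
from test
group by nbr;

-- Result (row order may differ between databases):
--  nbr  | all_rows | nbr_values | id_values
--  null |        2 |          0 |         2
--  0    |        1 |          1 |         1
--  1    |        2 |          2 |         2
```

Only the null group shows a difference, because that is the only group in which nbr has anything to be removed.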

This is standard SQL behavior and honored by pretty much every database.

One common use of this null-skipping behavior is to emulate the filter clause in databases that don't support it natively: http://modern-sql.com/feature/filter#conforming-alternatives
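
A rough sketch of that emulation, assuming the question's table with integer columns (the alias names are made up for illustration). A case expression without an else branch yields null for non-matching rows, and count then skips those rows:

```sql
-- Assumed setup: the question's table, recreated with integer columns.
drop table if exists test;
create table test (id int, nbr int);
insert into test (id, nbr) values (1, 0), (2, null), (3, null), (4, 1), (5, 1);

select count(*)                            as all_rows,  -- 5
       count(case when nbr = 1 then 1 end) as ones       -- 2
from test;

-- Where the filter clause is available (PostgreSQL, for example),
-- the second column could be written as:
--   count(*) filter (where nbr = 1)
```

The rows with a null nbr do not match nbr = 1 either, so they fall through to the implicit null and are not counted.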

The exceptions (aggregate functions that don't remove null prior to aggregation) are functions like json_arrayagg, json_objectagg, array_agg and the like.
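
As an illustration of such an exception, the following sketch assumes PostgreSQL (array_agg is not available in MySQL) and the question's table with integer columns. The alias names are hypothetical:

```sql
-- Assumed setup: the question's table, recreated with integer columns.
drop table if exists test;
create table test (id int, nbr int);
insert into test (id, nbr) values (1, 0), (2, null), (3, null), (4, 1), (5, 1);

-- PostgreSQL only: array_agg keeps the nulls that count(nbr) skips.
select count(nbr)                as counted,    -- 3
       array_agg(nbr order by id) as collected  -- {0,NULL,NULL,1,1}
from test;
```

The order by inside array_agg is only there to make the element order predictable.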

answered Oct 16 '22 09:10

Markus Winand

MySQL explains it in the documentation of function COUNT():

COUNT(expr)

Returns a count of the number of non-NULL values of expr in the rows retrieved by a SELECT statement.

COUNT(*) is somewhat different in that it returns a count of the number of rows retrieved, whether or not they contain NULL values.

PostgreSQL also explains it in the documentation:

Most aggregate functions ignore null inputs, so that rows in which one or more of the expression(s) yield null are discarded. This can be assumed to be true, unless otherwise specified, for all built-in aggregates.

For example, count(*) yields the total number of input rows; count(f1) yields the number of input rows in which f1 is non-null, since count ignores nulls; and count(distinct f1) yields the number of distinct non-null values of f1.
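
Applied to the table from the question, the three forms from the quoted documentation would look roughly like this. The setup and alias names are assumptions, with nbr standing in for f1:

```sql
-- Assumed setup: the question's table, recreated with integer columns.
drop table if exists test;
create table test (id int, nbr int);
insert into test (id, nbr) values (1, 0), (2, null), (3, null), (4, 1), (5, 1);

select count(*)            as all_rows,        -- 5: every input row
       count(nbr)          as non_null_values, -- 3: the values 0, 1, 1
       count(distinct nbr) as distinct_values  -- 2: the values 0 and 1
from test;
```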

answered Oct 16 '22 09:10

axiac

count(*) counts the number of rows in each group, regardless of whether the group by column contains null or non-null values.

count(nbr) counts the number of rows in each group where nbr is not null.
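
One consequence, sketched under the assumption that the question's table uses integer columns: subtracting the two counts gives the number of null values per group. The alias null_rows is made up for illustration:

```sql
-- Assumed setup: the question's table, recreated with integer columns.
drop table if exists test;
create table test (id int, nbr int);
insert into test (id, nbr) values (1, 0), (2, null), (3, null), (4, 1), (5, 1);

-- Rows in the group minus non-null nbr values = null nbr values.
select nbr,
       count(*) - count(nbr) as null_rows
from test
group by nbr;

-- null_rows is 2 for the null group and 0 for the groups 0 and 1.
```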

answered Oct 16 '22 09:10

ScaisEdge
