I know they return different results: COUNT(*) counts every row, while COUNT(column) skips rows where that column is NULL. That's not my question. Just imagine a case where I don't care (either because there are no nulls, or because there are only a few and I only want a general sense of the number of rows in the database).
My question is about the following apparent contradiction:
Here, one of the highest-rep users in the SQL tag says:
Your use of COUNT(*) or COUNT(column) should be based on the desired output only.
On the other hand, here is a comment with 47 upvotes saying:
... if you have a non-nullable column such as ID, then count(ID) will significantly improve performance over count(*).
The two seem to contradict each other. So can someone please explain: which one is correct, and why?
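For concreteness, a minimal illustration of the semantic difference (using a throwaway table variable, not taken from either linked post):
-- tiny illustration: count(*) counts all rows, count(c) skips NULLs
declare @t table (c int null);
insert into @t (c) values (1), (null), (3);
select count(*) as count_star, -- returns 3
       count(c) as count_c     -- returns 2, the NULL row is skipped
from @t;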
Expanding on scsimon's answer, I ran a test of 10 million rows on SQL Server 2017. Performance results are at the bottom.
EDIT
Following a suggestion in Jeroen's comment, I added more columns to the test table: an int identity, a bigint not null, and a nullable tinyint. In each case, the select count query chose a scan of the index on the tinyint column. The estimated cost was identical across the queries, but select count(NullableColumn) was noticeably slower than the other select count variants.
Code
if object_id('tempdb..#table') is null
begin
    declare @Count bigint = 10000000;

    -- test table: clustered int identity, non-nullable bigint, nullable tinyint
    create table #table
    (
        id    int identity primary key clustered,
        BigID bigint not null,
        TinyC tinyint null
    );

    -- tally-table row generator: cross joins build up to 10^8 rows without a loop
    WITH
    E1(N)  AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) dt(n)), -- 10 rows
    E2(N)  AS (SELECT 1 FROM E1 a, E1 b),   -- 10^2 = 100 rows
    E4(N)  AS (SELECT 1 FROM E2 a, E2 b),   -- 10^4 = 10,000 rows max
    E16(N) AS (SELECT 1 FROM E4 a, E4 b),   -- 10^8 = 100,000,000 rows max
    cteTally(N) AS
    (
        SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E16
    )
    insert into #table (BigID, TinyC)
    select N, case when N % 2 = 0 then null else N % 8 end -- every other row gets a NULL TinyC
    from cteTally
    where N <= @Count;

    create unique index IX_BigID on #table (BigID);
    create index IX_TinyC on #table (TinyC);
end
set statistics io on
set statistics time on
print 'count(*)'
select count(*) from #table
option (maxdop 1)
print 'count(ID)'
select count(ID) from #table
option (maxdop 1)
print 'count(BigID)'
select count(BigID) from #table
option (maxdop 1)
print 'count(TinyC)'
select count(TinyC) from #table
option (maxdop 1)
set statistics io off
set statistics time off
--drop table #table
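If you want to verify the index choice mentioned in the edit above, one option (a sketch of mine, not part of the original test) is to capture the estimated plans for the count queries in a separate batch:
-- sketch: return estimated plans as XML instead of executing,
-- so you can see which index each count variant scans
set showplan_xml on;
go
select count(*) from #table option (maxdop 1);
select count(TinyC) from #table option (maxdop 1);
go
set showplan_xml off;
go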
Performance
count(*)
Table '#Table'. Scan count 1, logical reads 13617, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 735 ms, elapsed time = 746 ms
count(ID) -- int identity primary key clustered
Table '#Table'. Scan count 1, logical reads 13617, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 765 ms, elapsed time = 776 ms
count(BigID) -- bigint not null, indexed
Table '#Table'. Scan count 1, logical reads 13617, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 735 ms, elapsed time = 731 ms
count(TinyC) -- tinyint nullable, indexed
Warning: Null value is eliminated by an aggregate or other SET operation.
Table '#Table'. Scan count 1, logical reads 13617, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 1593 ms, elapsed time = 1584 ms.
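One plausible explanation for the count(TinyC) slowdown (my reading, not stated in the original answer): count(TinyC) has to evaluate a NULL test for every row, which is roughly the same work as running an explicit filter such as:
-- count(TinyC) only counts rows where TinyC is not null,
-- so it is logically equivalent to this filtered count
select count(*)
from #table
where TinyC is not null
option (maxdop 1);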
I would expect count(column) to be a bit slower, but that wasn't the case in a quick test (below). What matters more is the first link you posted, since count(*) and count(column) can produce different results when the column is nullable. Also, I'm assuming you aren't returning any other columns, and thus removing tests which would be unique to your environment and indexes.
WITH
E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) dt(n)), -- 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 10^2 = 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 10^4 = 10,000 rows max
cteTally(N) AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
select N
into #table
from cteTally;

-- make every other row NULL so count(*) and count(N) can diverge
update #table
set N = null
where N % 2 = 0;

select count(*) from #table;
select count(N) from #table;
Similarly, on a table with 361,912 rows, here are the results for count(*), count(pk_column), and count(nullable_column), which isn't part of an index: