COUNT(*) vs COUNT(Id) in SQL Server 2005

I use the SQL COUNT function to get the total number of rows in a table. Is there any difference between the following two statements?

SELECT COUNT(*) FROM Table

and

SELECT COUNT(TableId) FROM Table

Also, is there any difference in terms of performance and execution time?

ACP asked Feb 08 '10 05:02




2 Answers

Thilo nailed the difference precisely... COUNT( column_name ) can return a lower number than COUNT( * ) if column_name can be NULL.
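To make that concrete, here is a minimal sketch of the NULL behaviour. It assumes an in-memory SQLite database driven from Python as a stand-in for SQL Server (the NULL-skipping rule for COUNT(column) is standard SQL, but this is not T-SQL), and the table `demo`, the column `val`, and the helper `count_rows` are hypothetical names made up for illustration.

```python
import sqlite3


def count_rows(values):
    """Return (COUNT(*), COUNT(val)) for a one-column table holding `values`."""
    # Assumption: SQLite stands in for SQL Server; `demo` and `val` are made-up names.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo (val INTEGER)")  # nullable column
    conn.executemany("INSERT INTO demo (val) VALUES (?)", [(v,) for v in values])
    # COUNT(*) counts every row; COUNT(val) counts only rows where val IS NOT NULL.
    row = conn.execute("SELECT COUNT(*), COUNT(val) FROM demo").fetchone()
    conn.close()
    return row


total, non_null = count_rows([1, None, 3, None])
print(total, non_null)  # 4 rows in total, 2 of them with a non-NULL val
```

This only illustrates the difference in results. It says nothing about locking or execution time on SQL Server.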

However, I'd like to take a slightly different angle on your question, since you seem to be focusing on performance.

First, note that issuing SELECT COUNT(*) FROM table; will potentially block writers, and it will also be blocked by other writers unless you have altered the isolation level (the knee-jerk choice tends to be WITH (NOLOCK), but I'm seeing a promising number of people finally starting to believe in RCSI, read committed snapshot isolation). This means that while you're reading the data to get your "accurate" count, all these DML requests are piling up, and when you've finally released all of your locks, the floodgates open, a bunch of insert/update/delete activity happens, and there goes your "accurate" count.

If you need an absolutely transactionally consistent and accurate row count (even if it is only valid for the number of milliseconds it takes to return the number to you), then SELECT COUNT( * ) is your only choice.

On the other hand, if you are trying to get a 99.9% accurate ballpark, you are much better off with a query like this:

SELECT row_count = SUM(row_count)
  FROM sys.dm_db_partition_stats
  WHERE [object_id] = OBJECT_ID('dbo.Table')
  AND index_id IN (0,1);

(The SUM is there to account for partitioned tables - if you are not using table partitioning, you can leave it out.)

This DMV maintains accurate row counts for tables with the exception of rows that are currently participating in transactions - and those very transactions are the ones that will make your SELECT COUNT query wait (and ultimately make it inaccurate before you have time to read it). But otherwise this will lead to a much quicker answer than the query you propose, and no less accurate than using WITH (NOLOCK).

Aaron Bertrand answered Oct 22 '22 00:10


count(id) needs to null-check the column (which may be optimized away for a primary key or otherwise not-null column), so count(*) or count(1) should be preferred (unless you really want to know the number of rows with a non-null value for id).
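As a quick sanity check of the result side of this, the sketch below compares the three forms on a NOT NULL primary key. It assumes an in-memory SQLite database driven from Python rather than SQL Server, and the table `demo` and the helper `count_variants` are hypothetical names. It shows only that the three counts agree when the column cannot be NULL; it does not measure whether the null check is optimized away.

```python
import sqlite3


def count_variants(n_rows):
    """Return (COUNT(*), COUNT(1), COUNT(id)) for a table with `n_rows` rows."""
    # Assumption: SQLite stands in for SQL Server; `demo` is a made-up table name.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY NOT NULL)")
    conn.executemany(
        "INSERT INTO demo (id) VALUES (?)",
        [(i,) for i in range(1, n_rows + 1)],
    )
    # id can never be NULL here, so all three expressions count the same rows.
    row = conn.execute("SELECT COUNT(*), COUNT(1), COUNT(id) FROM demo").fetchone()
    conn.close()
    return row


star, one, by_id = count_variants(5)
print(star, one, by_id)  # all three are 5
```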

Thilo answered Oct 22 '22 00:10