Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL count(*) performance

I have a SQL table BookChapters with over 20 millions rows. It has a clustered primary key (bookChapterID) and doesn't have any other keys or indexes. It takes miliseconds to run the following query

if (select count(*) from BookChapters) = 0 ... 

However, it takes over 10 minutes when I change it like so

if (select count(*) from BookChapters) = 1 ... 

or

if (select count(*) from BookChapters) > 1 ... 

Why is that? How can I get select count(*) to execute faster?

like image 244
danmiao Avatar asked Jun 21 '12 01:06

danmiao


People also ask

Which is faster count (*) or count column?

The simple answer is no – there is no difference at all. The COUNT(*) function counts the total rows in the table, including the NULL values.

Which is faster count (*) or count ID?

Your use of COUNT(*) or COUNT(column) should be based on the desired output only. ... if you have a non-nullable column such as ID, then count(ID) will significantly improve performance over count(*). The two seem to contradict each other.

Why count 1 is faster than count (*)?

There is no difference. "1" is a non-null expression: so it's the same as COUNT(*) . The optimizer recognizes it for what it is: trivial.

How do you make a count query faster in SQL?

So to make SELECT COUNT(*) queries fast, here's what to do:Get on any version that supports batch mode on columnstore indexes, and put a columnstore index on the table – although your experiences are going to vary dramatically depending on the kind of query you have.


1 Answers

Mikael Eriksson has a good explanation bellow why the first query is fast:

SQL server optimize it into: if exists(select * from BookChapters). So it goes looking for the presence of one row instead of counting all the rows in the table.

For the other two queries, SQL Server would use the following rule. To perform a query like SELECT COUNT(*), SQL Server will use the narrowest non-clustered index to count the rows. If the table does not have any non-clustered index, it will have to scan the table.

Also, if your table has a clustered index you can get your count even faster using the following query (borrowed from this site Get Row Counts Fast!)

--SQL Server 2005/2008 SELECT OBJECT_NAME(i.id) [Table_Name], i.rowcnt [Row_Count] FROM sys.sysindexes i WITH (NOLOCK) WHERE i.indid in (0,1) ORDER BY i.rowcnt desc  --SQL Server 2000 SELECT OBJECT_NAME(i.id) [Table_Name], i.rows [Row_Count] FROM sysindexes i (NOLOCK) WHERE i.indid in (0,1) ORDER BY i.rows desc 

It uses sysindexes system table. More info you can find here SQL Server 2000, SQL Server 2005, SQL Server 2008, SQL Server 2012

Here is another link Why is my SELECT COUNT(*) running so slow? with another solution. It shows technique that Microsoft uses to quickly display the number of rows when you right click on the table and select properties.

select sum (spart.rows) from sys.partitions spart where spart.object_id = object_id(’YourTable’) and spart.index_id < 2 

You should find that this returns very quickly no matter how many tables you have.

If you are using SQL 2000 still you can use the sysindexes table to get the number.

select max(ROWS) from sysindexes where id = object_id(’YourTable’) 

This number may be slightly off depending on how often SQL updates the sysindexes table, but it’s usually corrent (or at least close enough).

like image 136
Aleksey Cherenkov Avatar answered Sep 21 '22 17:09

Aleksey Cherenkov