Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is there a non-clustered index scan when counting all rows in a table?

Tags:

sql

sql-server

As far as I understand it, each transaction sees its own version of the database, so the system cannot get the total number of rows from some counter and thus needs to scan an index. But I thought it would be the clustered index on the primary key, not the additional indexes. If I had more than one additional index, which one will be chosen, anyway?

When digging into the matter, I've noticed another strange thing. Suppose there are two identical tables, Articles and Articles2, each with three columns: Id, View_Count, and Title. The first has only a clustered PK-based index, while the second one has an additional non-clustered, non-unique index on view_count. The query SELECT COUNT(1) FROM Articles runs 2 times faster for the table with the additional index.

like image 892
synapse Avatar asked Mar 07 '12 06:03

synapse


1 Answers

SQL Server will optimize your query - if it needs to count the rows in a table, it will choose the smallest possible set of data to do so.

So if you consider your clustered index - it contains the actual data pages - possibly several thousand bytes per row. To load all those bytes just to count the rows would be wasteful - even just in terms of disk I/O.

Therefore, it there is a non-clustered index that's not filtered or restricted in any way, SQL Server will pick that data structure to count - since the non-clustered index basically contains the columns you've put into the NC index (plus the clustered index key) - much less data to load just to count the number of rows.

like image 74
marc_s Avatar answered Nov 04 '22 06:11

marc_s