Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does performance degrade when using a non-indexed field in the SELECT clause?

Tags:

sql-server

Consider these three queries:

select sampleno from sample
    where markupdate > '1/1/2010'

select sampleno, markupdate from sample
    where markupdate > '1/1/2010'

select sampleno, markuptime from sample
    where markupdate > '1/1/2010'

sampleno and markupdate are indexed fields (sampleno is the primary key)

markuptime is not indexed

Queries 1 and 2 take about 1 second to run (returning 237K rows). Query 3 is still running after 3 minutes.

Why would the inclusion of a non-indexed field in the SELECT clause cause such a performance degradation?

This is a SQL 6.5 database.

like image 687
blueshift Avatar asked Feb 08 '13 16:02

blueshift


People also ask

How do indexes affect SQL performance?

A useful SQL Server index enhances the query and system performance without impacting the other queries. On the other hand, if you create an index without any preparation or consideration, it might cause performance degradations, slow data retrieval and could consume more critical resources such as CPU, IO and memory.

What is disadvantage of nonclustered index?

Disadvantages: Store the data in the most logical order without permitting to sort the data rows physically. Lookup process on the non-clustered index is a bit expensive. A corresponding update is required every time the clustering key is updates so as to store it on the non-clustered index.

How can indexes degrading performance?

Indexes will degrade insert/delete performance since indexes have to be updated. In case of update it depends on whether you update indexed columns. If not, performance should not be affected. Indexes can also speed up a DELETE and UPDATE statements if the WHERE condition can make use of the index.

Does indexing improve query performance?

Indexing makes columns faster to query by creating pointers to where data is stored within a database. Imagine you want to find a piece of information that is within a large database. To get this information out of the database the computer will look through every row until it finds it.


1 Answers

A table's data (basically: all columns) is stored in a clustered index. A clustered index is a binary tree that allows a binary search on the indexed column(s). It is special (clustered) in that it contains all other columns at the leaf level. Usually, the clustered index is also the primary key. In your case, it's:

(sampleno) include (markupdate, markuptime, ...)

A non-clustered index contains the indexed column(s) and (at the leaf level) the clustered index. When you use a non-clustered index, the database has to look up all the other columns in the clustered index. That process is called a lookup. In your case, the non-clustered index on (markupdate) is:

(markupdate) include (sampleno)

This index contains all data for a query on markupdate, sampleno. The technical term for such an index is a covering index. But when you add markuptime to the query, the index is no longer covering. It has to look up the value for markuptime in the clustered index. And lookups are expansive.

Only your third query requires lookups. And that's why your third query is slower.

like image 188
Andomar Avatar answered Sep 28 '22 01:09

Andomar