Hello SQL Server engine experts; please share with us a bit of your insight...
As I understand it, INCLUDE columns on a non-clustered index allow additional, non-key data to be stored with the index pages.
I am well aware of the performance benefits a clustered index has over a non-clustered index simply due to the 1 fewer step the engine must take in a retrieval in order to arrive at the data on disk.
However, since INCLUDE columns live in a non-clustered index, can the following query be expected to have essentially the same performance across scenarios 1 and 2 since all columns could be retrieved from the index pages in scenario 2 rather than ever resorting to the table data pages?
QUERY
SELECT A, B, C FROM TBL ORDER BY A
SCENARIO 1
CREATE CLUSTERED INDEX IX1 ON TBL (A, B, C);
SCENARIO 2
CREATE NONCLUSTERED INDEX IX1 ON TBL (A) INCLUDE (B, C);
If you select only the columns that were used to create the index, a non-clustered index can be faster.
An index with included columns can greatly improve query performance when all columns in the query are present in the index: the query optimizer can locate all column values within the index without accessing the table or clustered index, resulting in fewer disk I/O operations.
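One way to check the I/O claim on your own data is to compare the logical reads reported for the query under each scenario. This is a minimal sketch, assuming the TBL table from the question exists in the current database and that one of the two IX1 definitions is in place:

```sql
-- Report page reads per statement in the Messages output.
SET STATISTICS IO ON;

-- Run once with the scenario 1 index in place and once with scenario 2,
-- then compare the "logical reads" figures reported for TBL.
SELECT A, B, C
FROM TBL
ORDER BY A;

SET STATISTICS IO OFF;
```

Fewer logical reads for the same result set means the engine touched fewer pages to answer the query.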
In this case, you can create a large number of SQL Server indexes, adding all required columns as index key or non-key columns to enhance the performance of the SELECT queries and get the requested data faster. Another thing to consider when indexing a database table is the size of the table.
Indeed, a non-clustered index with covering include columns can play exactly the same role as a clustered index. The cost is at update time: more include columns means more indexes have to be updated when an included column value is changed in the base table (in the clustered index). Also, with more included columns, the size of the data increases: the database becomes larger and this can complicate maintenance operations.
In the end, it is a balance you have to strike between the covering value of the additional indexes and included columns and the cost of updates and increased data size.
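To get a rough feel for that balance on an existing table, you can compare how often each index is read with how often it has to be maintained. The following is only a sketch: it assumes TBL lives in the dbo schema (not stated in the question), and the counters in sys.dm_db_index_usage_stats only cover activity since they were last reset, for example by an instance restart.

```sql
-- Reads vs. writes per index on TBL.
-- Assumption: the table is dbo.TBL in the current database.
SELECT i.name AS index_name,
       i.type_desc,
       s.user_seeks,
       s.user_scans,
       s.user_lookups,
       s.user_updates   -- number of times the index had to be maintained
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
       ON s.object_id = i.object_id
      AND s.index_id = i.index_id
      AND s.database_id = DB_ID()
WHERE i.object_id = OBJECT_ID(N'dbo.TBL');
```

An index with high user_updates and few seeks or scans is costing more to maintain than it returns in reads.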
For this example you may actually get better performance with the non-clustered index. But, it really depends on additional information you haven't provided. Here are some thoughts.
SQL Server stores information in 8KB pages; this includes data and indexes. If your table only includes columns A, B and C, then the data will occupy approximately the same number of pages whether it is read from the data pages or from the non-clustered index pages. But if the table has more columns, then the data will need more pages, while the number of index pages wouldn't be any different.
So, in a table with more columns than your query needs, the query will work better with the non-clustered covering index (index with all columns). It will be able to process fewer pages to return the results you want.
Of course, performance differences may not be seen until you get a very large number of rows.
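If you want to see the page counts for yourself, sys.dm_db_index_physical_stats reports them per index. This is a sketch under assumptions: the table is dbo.TBL in the current database, and for a side-by-side comparison the clustered and non-clustered indexes would need different names on the same table (the question uses IX1 for both scenarios).

```sql
-- Leaf-level page count for each index (or the heap) on TBL.
-- Assumption: the table is dbo.TBL in the current database.
SELECT i.name AS index_name,
       ps.index_type_desc,
       ps.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.TBL'), NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id = ps.index_id
WHERE ps.alloc_unit_type_desc = N'IN_ROW_DATA';
```

On a table with columns beyond A, B and C, the non-clustered covering index should show noticeably fewer pages than the clustered index.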