CREATE NONCLUSTERED INDEX index_name
ON <object> ( column [ ASC | DESC ] [ ,...n ] )
[ INCLUDE ( column_name [ ,...n ] ) ]
[ WHERE <filter_predicate> ]
In the above syntax we will specify non-key columns in the INCLUDE. What is the advantage in specifying non-key columns?
The include clause allows us to make a distinction between columns we would like to have in the entire index (key columns) and columns we only need in the leaf nodes ( include columns). That means it allows us to remove columns from the non-leaf nodes if we don't need them there.
Included columns can be used to create a covering indexes without including all the data into the key columns. This covering index has the advantage that the index contains all the columns that are needed for a query.
It helps us find the rows or the values to be searched faster. They have various advantages like increased performance in searching for records, sorting records, grouping records, or maintaining a unique column.
The INCLUDE clause, applicable only on unique indexes, specifies additional columns to be appended to the set of index key columns. Any columns included with this clause are not used to enforce uniqueness. These included columns can improve the performance of some queries through index only access.
Let's assume you have an employee table like so:
CREATE TABLE Employee(EmployeeID INT IDENTITY(1,1) PRIMARY KEY,
LastName VARCHAR(50),
FirstName VARCHAR(50),
HireDate DATETIME,
Salary DECIMAL)
You would have the primary clustered key on EmployeeID, and possibly a non-clustered key on (LastName,FirstName) in order to be able to find employees by name.
CREATE INDEX NameIndex ON Employee(LastName ASC, FirstName ASC)
Now if you need to find "Joe Murphy" and retrieve his hire date and salary, what happens is an index seek in your name-based non-clustered key (which is good), but then in order to fetch the hire date and salary, SQL Server needs to do a so-called bookmark lookup into the actual table data to get the record for Joe Murphy. This will most likely incur one or several physical disk accesses (which is bad in terms of performance).
HOWEVER: if your name-based non-clustered index also specifies "INCLUDE (HireDate, Salary)":
CREATE INDEX NameIndex ON Employee(LastName ASC, FirstName ASC)
INCLUDE (HireDate, Salary)
then SQL Server is done once it's looked up Joe Murphy in the non-clustered name index --> all the fields to satisfy your query are in the non-clustered index, so there's no more need to do a disk-intensive bookmark lookup and your queries will be potentially much faster.
The downside of INCLUDE columns are increased disk-space need by non-clustered indexes, since they will have the included columns in their leaf-level nodes. It's a trade-off between speed and size (as usual).
Marc
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With