Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Server Indexing best practice (SQL Server 2008) [closed]

I have some doubts on choosing the right index and have some questions:

Clustered index

What is the best candidate?

Usually is the primary key but if the primary key is not used in the search by eg CustomerNo is used to search on customers should the clustered index put on CustomerNo?

Views with SchemaBinding

If have a view with indexes I read that these are not used but those on tables are.

Pointless no? Or am I missing the point? Will it make a difference using "NOExpand" to force to read the index from the view rather than the table?

Nonclustered indexes

Is it good practice when adding a nonclustered index to include every possible column till you reach the limit?

Many thanks for your time. I am reading massive database and speed is a must

like image 300
user9969 Avatar asked Feb 11 '14 07:02

user9969


2 Answers

The clustered index is the index that (a) defines the storage layout of your table (the table data is physically sorted by the clustering key), and (b) is used as the "row locator" in every single nonclustered index on that table.

Therefore, the clustered index should be

  • narrow (4 byte is ideal, 8 byte OK - anything else is too much)
  • unique (if you don't use a unique clustered index, SQL Server will add a 4 byte uniqueifier to your table)
  • static (shouldn't change)
  • optimally it should be ever-increasing
  • fixed with - e.g. don't use large Varchar(x) columns in your clustered index

Out of these requirements, the INT IDENTITY seems to be the most logical, most obvious choice. Don't use variable length columns, don't use multiple columns (if ever possible), don't use GUID (that's a horribly bad choice because of it's size and randomness)

For more background info on clustering keys and clustered indexes - read everything that Kimberly Tripp ever publishes! She's the Queen of Indexing in SQL Server - she knows her stuff extremely well!

See e.g. these blog posts:

  • GUIDs as PRIMARY KEYs and/or the clustering key
  • The Clustered Index Debate Continues...
  • Ever-increasing clustering key - the Clustered Index Debate..........again!
  • Disk space is cheap - that's not the point!

In general: don't overindex! too many indices is often worse than none!

For non-clustered indexes: I would typically index foreign key columns - those indexes help with JOINs and other operations and make things faster.

Other than that: don't put too many indexes in your database ! Every index must be maintained on every CRUD operation on your table! This is overhead - don't excessively index!

An index with all columns of a table is an especially bad idea since it really cannot be used for much - but carries a lot of administrative overhead.

Run your app, profile it - see which operations are slow, try to optimize those by adding a few selective indexes to your table.

like image 66
marc_s Avatar answered Sep 27 '22 02:09

marc_s


Clustered Indexes

Just to add to marc_s good answer, one exception to the standard INT IDENTITY PK approach to Clustered Indexes is when you have Parent Child tables, where all the children are frequently always retrieved at the same time as the parent. In this case, clustering by Child table by the Parent PK will reduce the number of pages read when the children are retrieved. For example:

CREATE TABLE Invoice
(
   -- Use the default MS Approach on the parent, viz Clustered by Surrogate PK 
   InvoiceID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
   -- Index Fields here
);


CREATE TABLE InvoiceLineItem
(
   -- Own Surrogate Key
   InvoiceLineItemID INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED,
   InvoiceID INT NOT NULL FOREIGN KEY REFERENCES Invoice(InvoiceID),
   -- Line Item Fields Here
);

-- But Cluster on the Parent FK
CREATE CLUSTERED INDEX CL_InvoiceLineItem ON InvoiceLineItem(InvoiceID);

NonClustered Indexes

No, never just include columns without careful thought - the index tree needs to be as narrow as possible. The ordering of the index columns is critical, and always ensure that the index is designed with selectivity of the data in mind - you will need to have a good understanding of the distribution of your data in order to choose optimal indexes.

You can consider using covering indexes to include (at most, a few) columns which would otherwise have required a bookmark lookup from the Nonclustered index back into the table when tuning performance-critical queries.

like image 43
StuartLC Avatar answered Sep 27 '22 02:09

StuartLC