
How to know when to use indexes and which type?

I've searched a bit and didn't see any similar question, so here goes.

How do you know when to put an index in a table? How do you decide which columns to include in the index? When should a clustered index be used?

Can an index ever slow down the performance of select statements? How many indexes is too many, and how big does a table need to be before it benefits from an index?

EDIT:

What about column data types? Is it OK to have an index on a varchar or datetime column?

Earlz asked Mar 09 '10 21:03



1 Answer

Well, the first question is easy:

When should a clustered index be used?

Always. Period. Except for a very few rare edge cases. A clustered index makes a table faster for every operation. YES, it does! See Kim Tripp's excellent "The Clustered Index Debate Continues" for background info. She also mentions her main criteria for a clustered index:

  • narrow
  • static (never changes)
  • unique
  • if at all possible: ever-increasing

INT IDENTITY fulfills these criteria perfectly - GUIDs do not. See "GUIDs as Primary Key" for extensive background info.
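To get a feel for why an ever-increasing key matters, here is a toy Python model. It is not SQL Server's actual B-tree: keys are inserted into a sorted list, and every insert that lands anywhere but the end is counted as a rough stand-in for the page splits and fragmentation a clustered index would have to absorb. The function name, the row count, and the seeded random 128-bit integers used in place of real GUIDs are all assumptions made for illustration.

```python
import bisect
import random


def count_mid_inserts(keys):
    """Insert keys in arrival order into a sorted list and count how many
    land somewhere other than the end (toy proxy for page splits)."""
    ordered = []
    mid_inserts = 0
    for key in keys:
        pos = bisect.bisect_right(ordered, key)
        if pos != len(ordered):
            mid_inserts += 1
        ordered.insert(pos, key)
    return mid_inserts


ROWS = 1000  # hypothetical table size

# INT IDENTITY style: every new key is larger than all existing ones.
identity_keys = range(1, ROWS + 1)

# GUID style: seeded random 128-bit integers stand in for random GUID values.
rng = random.Random(0)
guid_like_keys = [rng.getrandbits(128) for _ in range(ROWS)]

identity_mid = count_mid_inserts(identity_keys)
guid_mid = count_mid_inserts(guid_like_keys)

print("identity-style inserts not at the end:", identity_mid)
print("GUID-style inserts not at the end:   ", guid_mid)
```

In this model the identity-style keys always append, while nearly every random key lands in the middle of existing data.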

Why narrow? Because the clustering key is added to each and every index entry of each and every non-clustered index on the same table (in order to be able to actually look up the data row, if needed). You don't want to have VARCHAR(200) in your clustering key.
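A back-of-the-envelope calculation makes the cost of a wide clustering key concrete. This sketch assumes one copy of the key per row in every non-clustered index and ignores row headers, page fill, compression, and variable-length overhead. The row count and index count are hypothetical, and the VARCHAR(200) figure is a worst case in which every value uses all 200 bytes.

```python
def clustering_key_overhead_bytes(rows, nonclustered_indexes, key_bytes):
    """Bytes spent repeating the clustering key in non-clustered indexes,
    assuming one copy per row per index and nothing else."""
    return rows * nonclustered_indexes * key_bytes


ROWS = 10_000_000          # hypothetical row count
NONCLUSTERED_INDEXES = 5   # hypothetical number of non-clustered indexes

KEY_SIZES = {
    "INT": 4,                       # 4-byte integer
    "UNIQUEIDENTIFIER (GUID)": 16,  # 16-byte GUID
    "VARCHAR(200), full": 200,      # worst case: every value is 200 bytes long
}

for name, size in KEY_SIZES.items():
    total = clustering_key_overhead_bytes(ROWS, NONCLUSTERED_INDEXES, size)
    print(f"{name:<26} {total / 1_000_000:>10,.0f} MB")
```

Under these assumptions the INT key costs about 200 MB across the five indexes, the GUID about 800 MB, and a fully used VARCHAR(200) about 10,000 MB.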

Why unique? See above - the clustering key is the item and mechanism that SQL Server uses to uniquely find a data row. It has to be unique. If you pick a non-unique clustering key, SQL Server itself will add a 4-byte uniqueifier to each duplicate key value. Be careful of that!

Next: non-clustered indices. Basically there's one rule: any foreign key in a child table referencing another table should be indexed. It'll speed up JOINs and other operations.

Furthermore, any queries that have WHERE clauses are good candidates - pick first those which are executed a lot. Put indices on columns that show up in WHERE clauses and in ORDER BY clauses.
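The effect of indexing a foreign-key-style column used in WHERE and ORDER BY can be seen in a query plan. This sketch uses SQLite through Python's sqlite3 module as a stand-in, not SQL Server, so the plan text will differ from what SQL Server shows. The table, column, and index names are hypothetical.

```python
import sqlite3


def plan(conn, sql):
    """Return SQLite's query plan for `sql` as one string (detail column only)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER,"   # plays the role of a foreign key column
    " placed_at TEXT)"        # datetime stored as text in this sketch
)

query = "SELECT * FROM orders WHERE customer_id = 42 ORDER BY placed_at"

# Without an index: the whole table is scanned and sorted separately.
before = plan(conn, query)

# Index the WHERE column first, then the ORDER BY column.
conn.execute(
    "CREATE INDEX ix_orders_customer_placed ON orders (customer_id, placed_at)"
)
after = plan(conn, query)

print("before:", before)
print("after: ", after)
```

Before the index exists, the plan is a full scan plus a temporary sort. Afterwards it is a search on the index, which also delivers rows in `placed_at` order.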

Next: measure your system, check the DMVs (dynamic management views) for hints about unused or missing indices, and tweak your system over and over again. It's an ongoing process; you'll never be done! See here for info on those two DMVs (missing and unused indices).

Another word of warning: with a truckload of indices, you can make any SELECT query go really, really fast. But at the same time, INSERTs, UPDATEs and DELETEs, which have to update all the indices involved, might suffer. If you only ever SELECT - go nuts! Otherwise, it's a fine and delicate balancing act. You can always tweak a single query beyond belief - but the rest of your system might suffer in doing so. Don't over-index your database! Put a few good indices in place, check and observe how the system behaves, and then maybe add another one or two, and again: observe how the total system performance is affected by that.
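The write-side cost can be made visible without any timing. This sketch again uses SQLite through Python's sqlite3 module as a stand-in for SQL Server. It asks SQLite for the bytecode of a single-row INSERT and counts the `IdxInsert` steps, one of which is expected per secondary index. The table and index names are hypothetical, and SQLite's EXPLAIN output is a debugging aid whose format is not guaranteed to stay stable across versions.

```python
import sqlite3


def index_writes_per_insert(index_count):
    """Count the IdxInsert opcodes SQLite plans for a single-row INSERT
    into a table that has `index_count` secondary indexes (0 to 3)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c INTEGER)"
    )
    for col in ["a", "b", "c"][:index_count]:
        conn.execute(f"CREATE INDEX ix_t_{col} ON t ({col})")
    # EXPLAIN lists the bytecode without running the INSERT;
    # the second column of each row is the opcode name.
    program = conn.execute(
        "EXPLAIN INSERT INTO t (a, b, c) VALUES (1, 2, 3)"
    ).fetchall()
    conn.close()
    return sum(1 for row in program if row[1] == "IdxInsert")


for n in range(4):
    print(n, "indexes ->", index_writes_per_insert(n), "index writes per INSERT")
```

Every extra index adds work to each INSERT, which is the trade-off described above: faster reads are paid for on every write.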

marc_s answered Oct 21 '22 07:10