Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Clustered indexes SQL Server

I have an Oracle background, and using "Indexed organized tables" (IOT) for every table sounds unreasonable in Oracle and I never actually seen this. In SQL Server, every database I worked on, has a clustered index on every table, which is the same as IOT (conceptually).

Why is that? Is there any reason for using clustered index everywhere? Seems to me like they would be good only for a handful of cases.

Thanks

like image 472
Younes Avatar asked Apr 15 '12 12:04

Younes


2 Answers

A clustered index is not quite the same thing as an index-organised table. With an IOT, every field must participate in the IOT key. A clustered index on SQL Server does not have to be unique, and does not have to be the primary key.

Clustered indexes are widely used on SQL Server, as there is almost always some natural ordering that makes a commonly used query more efficient. IOTs in Oracle carry more baggage, so they aren't quite as useful, although they may well be more useful then they're commonly given credit for.

Historically, really old versions of SQL Server pre 6.5 or 7.0 IIRC did not support row-level locking and could only lock at a table or page level. Often a clustered index would be used to ensure that writes were scattered around the table's physical storage to minimise contention on page locks. However, SQL Server 6 went of support some years ago, so applications with this issue will be restricted to rare legacy systems.

like image 59
ConcernedOfTunbridgeWells Avatar answered Oct 16 '22 19:10

ConcernedOfTunbridgeWells


Without a clustered index, your table is organized as a heap. This means that every row that is insert is added at the data page at the end of the table. Also as rows get updated, they get moved to the data page at the end of the table if the data updated is larger than than before.

When it is good to not have a clustered index

If you have a table that needs the fastest possible inserts, but can sacrifice update, and read speed, then not having a clustered index may work for you. One example would be if you had a table that was being used as a queue, for instance, lots of inserts that later just get read and moved to a different table.

Clustered Indexes

Clustered indexes organize the data in your table based on the columns in the clustered index. If you cluster on the wrong thing for instance a uniqueidentifier this can slow things down (see below).

As long as your clustered index is on the value that is most commonly used for searching, and it is unique and increasing they you get some amazing performance benefits out of the clustered index. For instance if you have a table called USERS where you are commonly looking up user data based on USER_ID then clustering on USER_ID would speed up the performance of all of those lookups. This simply reduces the number of data pages that need to be read to get at your data.

If you have too many keys in your clustered index this can slow things down also.

General rules for clustered indexes:

Don't cluster on any varchar columns.

Clustering on INT IDENTITY columns is usually best.

Cluster on what you commonly search on.

Clustering on UniqueIdentifiers

With uniqueidentifiers in an index, they are extremely inefficient because there is no natural sort order. Based on the b-tree structure of the index you end up with extremely fragmented indexes when using uniqueidentifiers. After rebuilding or reorganizing, they are still extremely fragmented. So you end up with a slower index, that ends up being really huge in memory and on disk due to the fragmentation. Also on inserts of the uniqueidentifier you are more likely to end up with a page split on the index thus slowing your insert. Generally uniqueidentifiers are bad news for indexes.

Summary

My recommendation is that every table should have a clustered index on it unless there is a really good reason not to (ie table functioning as a queue).

like image 39
Steve Stedman Avatar answered Oct 16 '22 20:10

Steve Stedman