Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Do clustered indexes have to be unique?

What happens if a clustered index is not unique? Can it lead to bad performance because inserted rows flow to an "overflow" page of some sorts?

Is it "made" unique and if so how? What is the best way to make it unique?

I am asking because I am currently using a clustered index to divide my table in logical parts, but the performance is so-so, and recently I got the advice to make my clustered indexes unique. I'd like a second opinion on that.

like image 242
littlegreen Avatar asked Dec 02 '10 08:12

littlegreen


People also ask

Do clustered indexes need to be unique?

SQL Server does not require a clustered index to be unique, but yet it must have some means of uniquely identifying every row. That's why, for non-unique clustered indexes, SQL Server adds to every duplicate instance of a clustering key value a 4-byte integer value called a uniqueifier.

What happens if clustered index is not unique?

Internally, if the clustering key is not unique then SQL Server will “uniquify” it by adding a 4-byte integer to the data.

Can clustered index have duplicates?

Yes, you can create a clustered index on key columns that contain duplicate values.

Is clustered index default unique?

PRIMARY KEY and UNIQUE constraints When you create a UNIQUE constraint, a unique nonclustered index is created to enforce a UNIQUE constraint by default. You can specify a unique clustered index if a clustered index on the table does not already exist.


1 Answers

They don't have to be unique but it certainly is encouraged.
I haven't encountered a scenario yet where I wanted to create a CI on a non-unique column.

What happens if you create a CI on a non-unique column

If the clustered index is not a unique index, SQL Server makes any duplicate keys unique by adding an internally generated value called a uniqueifier

Does this lead to bad performance?

Adding a uniqueifier certainly adds some overhead in calculating and in storing it.
If this overhead will be noticable depends on several factors.

  • How much data the table contains.
  • What is the rate of inserts.
  • How often is the CI used in a select (when no covering indexes exist, pretty much always).

Edit
as been pointed out by Remus in comments, there do exist use cases where creating a non-unique CI would be a reasonable choice. Me not having encountered one off those scenarios merely shows my own lack of exposure or competence (pick your choice).

like image 77
Lieven Keersmaekers Avatar answered Oct 09 '22 14:10

Lieven Keersmaekers