Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

NewSequentialId on UniqueIdentifier Clustered Index

I am working on database standards for a new database my company is starting. One of the things we are trying to define is Primary Key and Clustered Index rules in relation to UniqueIdentifiers.

(NOTE: I do not want a discussion on the pros and cons of using a UniqueIdentifier as a primary key or clustered index. There is a ton of info on the web about that. This is not that discussion.)

So here is the scenario that has me worried:

Say I have a table with a UniqueIdentifier as the clustered index and primary key. Lets call it ColA. I set the default value for ColA to be NewSequentialId().

Using that NewSequentialId() I insert three sequential rows:

{72586AA4-D2C3-440D-A9FE-CC7988DDF065}
{72586AA4-D2C3-440D-A9FE-CC7988DDF066}
{72586AA4-D2C3-440D-A9FE-CC7988DDF067}

Then I reboot my server. The docs for NewSequentialId say that "After restarting Windows, the GUID can start again from a lower range, but is still globally unique."

So the next starting point can be lower than the previous range.

So after the restart, I insert 3 more values:

{35729A0C-F016-4645-ABA9-B098D2003E64}
{35729A0C-F016-4645-ABA9-B098D2003E65}
{35729A0C-F016-4645-ABA9-B098D2003E66}

(I am not sure exactly how the guid is represented in the database, but lets assume since this one starts with 3 and the previous ones started with 7 that the 3 ones are "smaller" than the 7 ones.)

When you do an insert that is in the middle of a clustered index, a remapping of the index has to happen. (At least so my DBA has told me.) And every time I reboot I run the risk of having my new UniqueIdentifier range be right in the middle of other previous ranges.

So my question is: Since the next set of UniqueIdentifiers will be smaller than the last set, will every insert cause my clustered index to shuffle?

And if not, why? Does SQL Server know that I am using NewSequentialId? Does it some how compensate for that?

If not, then how does it know what I will insert next? Maybe the next million inserts will start with 3. Or maybe they will start with 7. How does it know?

Or does it not know and just keeps everything in order. If that is the case then one reboot could massively affect performance. (Which makes me think I need my own custom NewSequentialId that is not affected by reboots.) Is that correct? Or is there some magic I am not aware of?

EDIT: GUID as a clustered index is strongly discouraged in my standard. As I said above, there are many reasons that this is a bad idea. I am trying to find out if this is another reason why.

like image 746
Vaccano Avatar asked Aug 04 '11 15:08

Vaccano


1 Answers

Normally you will create your indexes with an appropriate FILL FACTOR to leave empty space in all your pages for just such a scenario. That being said, the clustered index does get reordered once the empty space is filled.

I know you don't want to discuss using GUID as a clustered key, but this is one of the reasons that it's not a recommended practice.

What will happen is that you will have an increasing volume of page splits, which will lead to a very high level of fragmentation as you keep inserting rows, and you will need to rebuild your index at a higher frequency to keep performance in line.

For a full treatment on the topic, there's no better source than

Kim
Tripp's
Blog

As a side note, when you are considering creating your own NewSequentialID creation function, you probably have a design issue and should reconsider your plan.

like image 189
JNK Avatar answered Nov 15 '22 04:11

JNK