Sequential Guid and fragmentation

Tags:

guid

sql-server

database-fragmentation

I'm trying to understand how a sequential GUID performs better than a regular GUID.

Is it because, with a regular GUID, the index uses the last bytes of the GUID to sort? Since the values are random, will that cause a lot of fragmentation and page splits, because SQL Server often has to move data to another page to insert new data?

And with a sequential GUID, since it is sequential, will there be far fewer page splits and much less fragmentation?

Is my understanding correct?

If anyone can shed more light on the subject, I'd appreciate it very much.

Thank you

EDIT:

Sequential guid = NEWSEQUENTIALID(),

Regular guid = NEWID()

asked Aug 10 '10 14:08

pdiddy

2 Answers

You've pretty much said it all in your question.

With a sequential GUID / primary key (assuming the primary key is also the clustered index, which is the SQL Server default), new rows are added together at the end of the table, which makes things nice and easy for SQL Server. In comparison, a random primary key means that new records could be inserted anywhere in the table. The last page of the table is fairly likely to be in the cache (if that's where all of the reads are going), but a random page in the middle of the table is much less likely to be cached, meaning additional IO is required.
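To make the caching argument concrete, here is a rough Python sketch. It is only a toy model, not how SQL Server's buffer pool actually works: the LRU cache, the page counts and the `cache_hit_rate` helper are all assumptions made up for illustration. It compares how often the target page of an insert is already cached when inserts always go to the tail of the table versus a random page.

```python
import random
from collections import OrderedDict

ROWS_PER_PAGE = 100    # assumed rows per page (illustrative)
TABLE_PAGES = 10_000   # assumed size of the existing table, in pages
CACHE_PAGES = 100      # assumed number of pages the cache can hold
N_INSERTS = 50_000


def cache_hit_rate(page_for_insert, n_inserts, cache_pages):
    """Fraction of inserts whose target page is already in a small LRU cache."""
    cache = OrderedDict()
    hits = 0
    for i in range(n_inserts):
        page = page_for_insert(i)
        if page in cache:
            hits += 1
            cache.move_to_end(page)        # mark as most recently used
        else:
            cache[page] = True             # simulate reading the page from disk
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict the least recently used page
    return hits / n_inserts


rng = random.Random(0)

# Sequential key: every insert lands on the current last page of the table.
sequential_hit_rate = cache_hit_rate(
    lambda i: TABLE_PAGES + i // ROWS_PER_PAGE, N_INSERTS, CACHE_PAGES
)

# Random key: every insert lands on an arbitrary existing page.
random_hit_rate = cache_hit_rate(
    lambda i: rng.randrange(TABLE_PAGES), N_INSERTS, CACHE_PAGES
)

print(f"sequential: {sequential_hit_rate:.2%}  random: {random_hit_rate:.2%}")
```

In this model the sequential case only misses when it starts a new page, while the random case hits roughly in proportion to how much of the table fits in the cache.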

On top of that, when inserting rows into the middle of the table there is a chance that the target page doesn't have enough room for the extra row. If so, SQL Server has to perform additional, expensive IO to make room for the record (a page split, which moves roughly half of the rows to a new page). The only way to avoid this is to leave gaps scattered among the data to allow for extra records to be inserted (controlled by the index fill factor), which itself causes performance issues because the data is spread over more pages and so more IO is required to read the entire table.
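The page split effect can be sketched with a small simulation. This is a simplified model under stated assumptions, not SQL Server's real storage engine: pages are plain sorted lists with a made-up capacity of 50 keys, `simulate_inserts` is a hypothetical helper, ascending integers stand in for NEWSEQUENTIALID() values, and random 128-bit integers stand in for NEWID() values (the special sort order SQL Server uses for uniqueidentifier is not modelled).

```python
import bisect
import random

PAGE_CAPACITY = 50  # assumed number of keys that fit on one page


def simulate_inserts(keys, capacity=PAGE_CAPACITY):
    """Insert keys one by one; return (page_splits, page_count, avg_fullness)."""
    pages = [[]]     # pages in key order; each page is a sorted list of keys
    separators = []  # separators[i] is the lowest key stored on pages[i + 1]
    splits = 0
    for key in keys:
        i = bisect.bisect_right(separators, key)  # page whose range covers key
        page = pages[i]
        if len(page) >= capacity:
            if i == len(pages) - 1 and key > page[-1]:
                # Appending past the end: start a fresh page, no rows move.
                pages.append([key])
                separators.append(key)
                continue
            # Full page in the middle: move the upper half to a new page.
            splits += 1
            mid = len(page) // 2
            new_page = page[mid:]
            del page[mid:]
            pages.insert(i + 1, new_page)
            separators.insert(i, new_page[0])
            if key >= new_page[0]:
                page = new_page
        bisect.insort(page, key)
    fullness = sum(len(p) for p in pages) / (len(pages) * capacity)
    return splits, len(pages), fullness


N = 20_000
rng = random.Random(0)

seq_splits, seq_pages, seq_fullness = simulate_inserts(range(N))
rand_splits, rand_pages, rand_fullness = simulate_inserts(
    [rng.getrandbits(128) for _ in range(N)]
)

print("sequential:", seq_splits, seq_pages, round(seq_fullness, 2))
print("random:    ", rand_splits, rand_pages, round(rand_fullness, 2))
```

With ascending keys the model never splits a page and every page ends up full. With random keys it splits repeatedly and leaves the pages only partly full, so the same rows occupy more pages, which is the fragmentation the question asks about.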

answered Sep 29 '22 21:09

Justin

I defer to Kimberly L. Tripp's wisdom on this topic:

But, a GUID that is not sequential - like one that has its values generated in the client (using .NET) OR generated by the newid() function (in SQL Server) can be a horribly bad choice - primarily because of the fragmentation that it creates in the base table but also because of its size. It's unnecessarily wide (it's 4 times wider than an int-based identity - which can give you 2 billion (really, 4 billion) unique rows). And, if you need more than 2 billion you can always go with a bigint (8-byte int) and get 2^63-1 rows.

Read more: http://www.sqlskills.com/BLOGS/KIMBERLY/post/GUIDs-as-PRIMARY-KEYs-andor-the-clustering-key.aspx#ixzz0wDK6cece
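The size figures in that quote are easy to check. A quick sketch, using the storage sizes of SQL Server's int (4 bytes), bigint (8 bytes) and uniqueidentifier (16 bytes) types; the 100 million row table is just an assumed example, and the totals count key bytes only, ignoring row and page overhead.

```python
INT_BYTES, BIGINT_BYTES, GUID_BYTES = 4, 8, 16

int_positive_max = 2**31 - 1   # the "2 billion" positive values of an int
int_total_values = 2**32       # the "really, 4 billion" if negatives are used too
bigint_positive_max = 2**63 - 1

guid_vs_int_width = GUID_BYTES // INT_BYTES  # how many times wider a GUID is

ROWS = 100_000_000  # assumed table size, for illustration only
key_bytes = {
    "int": ROWS * INT_BYTES,
    "bigint": ROWS * BIGINT_BYTES,
    "uniqueidentifier": ROWS * GUID_BYTES,
}

print(f"int max: {int_positive_max:,}  bigint max: {bigint_positive_max:,}")
print(f"GUID is {guid_vs_int_width}x the width of an int")
for name, size in key_bytes.items():
    print(f"{name}: {size / 10**9:.1f} GB of key data for {ROWS:,} rows")
```

Keep in mind that the clustering key is also carried in every nonclustered index, so the extra width is paid more than once.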

answered Sep 29 '22 20:09

Joe Stefanelli
