In a database where all of your primary keys are GUIDs, what are the implications and the pros and cons of using newid() versus newsequentialid() as the "default value or binding"?
The only difference that I know of is that newid() creates a new random GUID, whereas newsequentialid() generates each new GUID in an incrementing sequence relative to the last one it produced.
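For concreteness, here is a minimal sketch of the two setups I mean (table and constraint names are just examples):

```sql
-- Two hypothetical tables, one per default. Note that
-- NEWSEQUENTIALID() can only be used in a DEFAULT constraint.
CREATE TABLE dbo.OrdersRandom
(
    OrderId UNIQUEIDENTIFIER NOT NULL
        CONSTRAINT DF_OrdersRandom_OrderId DEFAULT NEWID(),
    OrderDate DATETIME2 NOT NULL,
    CONSTRAINT PK_OrdersRandom PRIMARY KEY CLUSTERED (OrderId)
);

CREATE TABLE dbo.OrdersSequential
(
    OrderId UNIQUEIDENTIFIER NOT NULL
        CONSTRAINT DF_OrdersSequential_OrderId DEFAULT NEWSEQUENTIALID(),
    OrderDate DATETIME2 NOT NULL,
    CONSTRAINT PK_OrdersSequential PRIMARY KEY CLUSTERED (OrderId)
);
```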
When you insert a row, the DB places it in order relative to the other PKs in the table. With a normal GUID, that position could be anywhere in the table; a newsequentialid() value will always be added at the end of the table.
So the performance of inserts is improved.
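A quick way to observe this, reusing the hypothetical dbo.OrdersSequential table sketched in the question:

```sql
-- The DEFAULT constraint generates the key on each insert.
INSERT INTO dbo.OrdersSequential (OrderDate) VALUES (SYSDATETIME());
INSERT INTO dbo.OrdersSequential (OrderDate) VALUES (SYSDATETIME());
INSERT INTO dbo.OrdersSequential (OrderDate) VALUES (SYSDATETIME());

-- Sorting by the key returns the rows in insertion order: each new
-- GUID sorts after the previous one, so every insert lands at the
-- logical end of the clustered index.
SELECT OrderId, OrderDate
FROM dbo.OrdersSequential
ORDER BY OrderId;
```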
This site explains the differences between the two methods and benchmarks them.
Update - the referenced blog post has moved, so the link now points to a web.archive.org copy. Here is the key takeaway:
Most striking is the number of writes required by the NEWID system function. This, coupled with the average page density of 69%, is evidence of the page splitting caused by the random distribution of inserts at the leaf level. As soon as a page fills up, it needs to be split into 2 pages of 50% each for the insert to complete. Not only has page splitting resulted in poor page density, it has fragmented the data pages quite badly (there is a 99% probability that the next data page is not next to the current one). In our tests the most likely place for a free page required for the page split is at the end of the table irrespective of where the row is being inserted. Therefore to read the rows in order the scan needs to keep jumping back and forth between widely distributed split pages, hence the appalling fragmentation.
--Stefan Delmarco
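If you want to reproduce the page-density and fragmentation numbers the benchmark refers to, sys.dm_db_index_physical_stats reports both (the table name here is the hypothetical one from the question):

```sql
-- Leaf-level page density and logical fragmentation for the
-- clustered index. DETAILED mode is needed to populate
-- avg_page_space_used_in_percent.
SELECT OBJECT_NAME(ips.object_id)          AS table_name,
       ips.avg_fragmentation_in_percent    AS fragmentation_pct,
       ips.avg_page_space_used_in_percent  AS page_density_pct,
       ips.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.OrdersRandom'),
         NULL, NULL, 'DETAILED') AS ips
WHERE ips.index_level = 0;  -- leaf level only
```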
Regarding the use of sequential keys (as with identity, sequence, and NEWSEQUENTIALID) vs. nonsequential ones (as with NEWID or a custom randomized key generator), there are several aspects to consider.
Starting with sequential keys, all rows go into the right end of the index. When a page is full, SQL Server allocates a new page and fills it. This results in less fragmentation in the index, which is beneficial for read performance. Also, insertions can be faster when a single session is loading the data, and the data resides on a single drive or a small number of drives.
However, with high-end storage subsystems that have many spindles, the situation can be different. When loading data from multiple sessions, you will end up with page latch contention (latches are objects used to synchronize access to database pages) against the rightmost pages of the index leaf level’s linked list. This bottleneck prevents use of the full throughput of the storage subsystem. Note that if you decide to use sequential keys and you’re using numeric ones, you can always start with the lowest value in the type to use the entire range. For example, instead of starting with 1 in an INT type, you could start with -2,147,483,648.
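To illustrate that last tip (this snippet is mine, not from the Training Kit):

```sql
-- Seed the IDENTITY at the minimum INT value so the whole
-- range of the type (~4.3 billion values) is usable.
CREATE TABLE dbo.Events
(
    EventId INT IDENTITY(-2147483648, 1) NOT NULL
        CONSTRAINT PK_Events PRIMARY KEY CLUSTERED,
    Payload NVARCHAR(100) NOT NULL
);
```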
Consider nonsequential keys, such as random ones generated with NEWID or with a custom solution. When trying to force a row into an already full page, SQL Server performs a classic page split—it allocates a new page and moves half the rows from the original page to the new one. A page split has a cost, plus it results in index fragmentation. Index fragmentation can have a negative impact on the performance of reads. However, in terms of insert performance, if the storage subsystem contains many spindles and you’re loading data from multiple sessions, the random order can actually be better than sequential despite the splits.
That’s because there’s no hot spot at the right end of the index, and you use the storage subsystem’s available throughput better. A good benchmark demonstrating this strategy can be found in a blog post by Thomas Kejser at http://blog.kejser.org/2011/10/05/boosting-insert-speed-by-generating-scalable-keys/.
Source: Querying Microsoft® SQL Server® 2012 Exam 70-461 Training Kit
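For a flavor of the kind of scalable-key scheme Kejser's post benchmarks, here is a rough sketch of the general idea (my own simplified version, not his exact implementation): a small per-session bucket leads the key, so concurrent sessions insert at different points in the index instead of at one hot spot.

```sql
-- A shared sequence provides uniqueness within each bucket.
CREATE SEQUENCE dbo.EventSeq AS BIGINT START WITH 1;

CREATE TABLE dbo.EventsScalable
(
    Bucket  TINYINT NOT NULL,   -- spreads inserts over 16 insertion points
    EventId BIGINT  NOT NULL,
    Payload NVARCHAR(100) NOT NULL,
    CONSTRAINT PK_EventsScalable PRIMARY KEY CLUSTERED (Bucket, EventId)
);

-- Each session derives its bucket from its session id, so two
-- concurrent loaders contend on different index pages.
INSERT INTO dbo.EventsScalable (Bucket, EventId, Payload)
VALUES (@@SPID % 16, NEXT VALUE FOR dbo.EventSeq, N'example row');
```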