SQL GUID Vs Integer

Tags: guid, sql-server-2008, auto-increment

I have recently started a new job and noticed that all the SQL tables use the GUID data type for the primary key.

In my previous job we used auto-incrementing integers for the primary key, which I found much easier to work with.

For example, say you had two related tables, Product and ProductType. I could easily cross-check the 'ProductTypeID' column of both tables for a particular row and map the data in my head, because it's easy to remember a number (2, 4, 45, etc.) as opposed to a GUID (E75B92A3-3299-4407-A913-C5CA196B3CAB).

The extra frustration comes from wanting to understand how the tables are related; sadly, there is no database diagram :(

A lot of people say that GUIDs are better because you can generate the unique identifier in your C# code (for example using Guid.NewGuid()) without requiring SQL Server to do it. This also lets you know the ID before the row is inserted. But I've seen that it is still possible to retrieve the 'next auto-incremented integer' too.

A DBA contractor reported that our queries could be up to 30% faster if we used the integer type instead of GUIDs.

Why does the GUID data type exist, and what advantages does it really provide? Even if it's a choice made by some professional, there must be some good reasons why it's implemented.

770

asked May 10 '10 17:05

Dalbir Singh

2 Answers

GUIDs are good as identity fields in certain cases:

When you have multiple instances of SQL Server (on different servers) and need to combine their updates later on without breaking referential integrity
Disconnected clients that create data: they can create rows without worrying that the ID is already taken

GUIDs are generated to be globally unique, which is why they are suited for such scenarios.
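The merge scenario above can be illustrated with a small Python sketch. It is only a toy model, not SQL Server replication: the `merge` helper and the sample rows are hypothetical, and Python's `uuid.uuid4()` stands in for a client-generated GUID.

```python
import uuid

def merge(*sources):
    """Merge (id, payload) rows from several sources; report IDs that clash."""
    merged, collisions = {}, []
    for rows in sources:
        for row_id, payload in rows:
            if row_id in merged:
                collisions.append(row_id)  # same ID already used by another source
            else:
                merged[row_id] = payload
    return merged, collisions

# Two offline clients, each numbering its own rows from 1 (auto-increment style).
client_a_int = [(1, "A: widget"), (2, "A: gadget")]
client_b_int = [(1, "B: sprocket"), (2, "B: gizmo")]
merged_int, int_collisions = merge(client_a_int, client_b_int)

# The same rows keyed by client-generated GUIDs.
client_a_guid = [(uuid.uuid4(), "A: widget"), (uuid.uuid4(), "A: gadget")]
client_b_guid = [(uuid.uuid4(), "B: sprocket"), (uuid.uuid4(), "B: gizmo")]
merged_guid, guid_collisions = merge(client_a_guid, client_b_guid)

print(int_collisions)                     # [1, 2]
print(len(merged_guid), guid_collisions)  # 4 []
```

With integer keys, every row from the second client clashes with one from the first and would have to be renumbered, along with any foreign keys pointing at it. With GUID keys, the rows can be combined as they are.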

160

answered Oct 21 '22 18:10

Oded

Contrary to what most folks here seem to preach, I see GUIDs as more of a plague than a blessing. Here's why:

GUIDs may seem to be a natural choice for your primary key - and if you really must, you could probably argue for using one as the PRIMARY KEY of the table. What I'd strongly recommend against is using the GUID column as the clustering key, which SQL Server does by default unless you specifically tell it not to.

You really need to keep two issues apart:

The primary key is a logical construct: one of the candidate keys that uniquely and reliably identifies every row in your table. This can be anything, really (an INT, a GUID, a string); pick what makes most sense for your scenario.
The clustering key (the column or columns that define the "clustered index" on the table) is a physical, storage-related thing, and here a small, stable, ever-increasing data type is your best pick: INT or BIGINT as your default option.

By default, the primary key on a SQL Server table is also used as the clustering key - but it doesn't need to be that way! I've personally seen massive performance gains when breaking up a GUID-based primary / clustered key into two separate keys: the primary (logical) key on the GUID, and the clustering (ordering) key on a separate INT IDENTITY(1,1) column.

As Kimberly Tripp - the Queen of Indexing - and others have stated a great many times - a GUID as the clustering key isn't optimal, since due to its randomness, it will lead to massive page and index fragmentation and to generally bad performance.
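To get a feel for why random keys hurt an ordered structure, here is a rough Python sketch. It is a deliberately simplified model and not how SQL Server manages pages: it keeps keys in a sorted list and counts how many inserts land somewhere other than the end. The helper name `count_mid_inserts` and the row count are assumptions made for illustration, and `uuid.uuid4()` stands in for a random GUID.

```python
import bisect
import uuid

def count_mid_inserts(keys):
    """Insert keys one by one into a sorted list.

    Returns how many inserts did NOT land at the end of the list. In a
    clustered index, such inserts go into already-filled pages and are the
    ones that can force page splits.
    """
    ordered = []
    mid_inserts = 0
    for key in keys:
        pos = bisect.bisect_left(ordered, key)
        if pos != len(ordered):
            mid_inserts += 1
        ordered.insert(pos, key)
    return mid_inserts

N = 10_000  # hypothetical row count, kept small so the sketch runs quickly

# Ever-increasing keys (IDENTITY-style): every insert is an append.
sequential_mid = count_mid_inserts(range(1, N + 1))

# Random GUIDs: almost every insert lands in the middle of existing data.
guid_mid = count_mid_inserts(uuid.uuid4().bytes for _ in range(N))

print(f"sequential keys: {sequential_mid} of {N} inserts were not appends")
print(f"random GUIDs:    {guid_mid} of {N} inserts were not appends")
```

The sequential run reports zero non-append inserts, while the GUID run reports nearly all of them. That is the access pattern behind the fragmentation described above.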

Yes, I know - there's newsequentialid() in SQL Server 2005 and up - but even that is not truly and fully sequential and thus also suffers from the same problems as the GUID - just a bit less prominently so. Plus, you can only use it as a default for a column in your table - you cannot get a new sequential GUID in T-SQL code (like a trigger or something) - another major drawback.

Then there's another issue to consider: the clustering key on a table will be added to each and every entry on each and every non-clustered index on your table as well - thus you really want to make sure it's as small as possible. Typically, an INT with 2+ billion rows should be sufficient for the vast majority of tables - and compared to a GUID as the clustering key, you can save yourself hundreds of megabytes of storage on disk and in server memory.

Quick calculation - using INT vs. GUID as Primary and Clustering Key:

Base Table with 1'000'000 rows (3.8 MB vs. 15.26 MB)
6 nonclustered indexes (22.89 MB vs. 91.55 MB)

TOTAL: about 27 MB vs. 107 MB - and that's just on a single table!
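The arithmetic behind these figures can be reproduced in a few lines of Python. This sketch counts only the bytes of the key itself (4 for INT, 16 for GUID), stored once per row in the base table and once per row in each nonclustered index. It ignores row and page overhead, so the real numbers would differ. The function name is hypothetical.

```python
ROWS = 1_000_000
NONCLUSTERED_INDEXES = 6
KEY_BYTES = {"INT": 4, "GUID": 16}
MB = 1024 * 1024

def key_storage_mb(key_bytes):
    """Return (base table, nonclustered indexes, total) key storage in MB."""
    base = ROWS * key_bytes / MB
    nonclustered = NONCLUSTERED_INDEXES * base  # key repeated in every index entry
    return base, nonclustered, base + nonclustered

for name, size in KEY_BYTES.items():
    base, nonclustered, total = key_storage_mb(size)
    print(f"{name}: base {base:.2f} MB, indexes {nonclustered:.2f} MB, total {total:.2f} MB")
# INT: base 3.81 MB, indexes 22.89 MB, total 26.70 MB
# GUID: base 15.26 MB, indexes 91.55 MB, total 106.81 MB
```

Summing the two lines gives roughly 26.7 MB for INT against 106.8 MB for GUID, a factor of four that follows directly from the 4-byte vs. 16-byte key size.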

Some more food for thought - excellent stuff by Kimberly Tripp - read it, read it again, digest it! It's the SQL Server indexing gospel, really.

GUIDs as PRIMARY KEY and/or clustered key
The clustered index debate continues
Ever-increasing clustering key - the Clustered Index Debate..........again!

Marc

answered Oct 21 '22 17:10

marc_s
