INT vs Unique-Identifier for ID field in database

Tags: sql, sql-server, tsql, uniqueidentifier

I am creating a new database for a web site using SQL Server 2005 (possibly SQL Server 2008 in the near future). As an application developer, I've seen many databases that use an integer (or bigint, etc.) for an ID field of a table that will be used for relationships. But lately I've also seen databases that use the unique identifier (GUID) for an ID field.

Does one have an advantage over the other? Will integer fields be faster for querying, joining, etc.?

UPDATE: To make it clear, this is for a primary key in the tables.

975

asked Jul 20 '09 03:07

mkchandler

1 Answer

GUIDs are problematic as clustered keys because of their high randomness. Paul Randal addressed this issue in the last TechNet Magazine Q&A column, in response to this question: "I'd like to use a GUID as the clustered index key, but the others are arguing that it can lead to performance issues with indexes. Is this true and, if so, can you explain why?"

Now bear in mind that the discussion is specifically about clustered indexes. You say you want to use the column as 'ID', but it is unclear whether you mean the clustered key or just the primary key. Typically the two overlap, so I'll assume you want to use it as the clustered index. The reasons why that is a poor choice are explained in the article mentioned above.

For non-clustered indexes GUIDs still have some issues, but not nearly as big as when they are the leftmost clustered key of the table. Again, the randomness of GUIDs introduces page splits and fragmentation, but only at the non-clustered index level (a much smaller problem).
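To make the randomness argument concrete, here is a rough Python sketch of a toy leaf level. It is not how SQL Server is implemented: the page capacity, the 50/50 split rule, and the helper name `simulate_inserts` are all assumptions made for illustration. It only compares ever-increasing keys (like an identity INT) with random ones (like `uuid4` values).

```python
import bisect
import uuid

PAGE_CAPACITY = 50  # rows per leaf page; arbitrary value for this toy model


def simulate_inserts(keys):
    """Insert non-negative integer keys into a toy sorted leaf level.

    Returns (split_count, average_page_fill).
    """
    pages = [[]]   # each page is a sorted list of keys
    lows = [-1]    # lower bound of each page; -1 works because keys are >= 0
    splits = 0
    total = 0
    for key in keys:
        # Find the page whose key range covers this key.
        idx = bisect.bisect_right(lows, key) - 1
        page = pages[idx]
        bisect.insort(page, key)
        total += 1
        if len(page) <= PAGE_CAPACITY:
            continue
        if idx == len(pages) - 1 and page[-1] == key:
            # Append at the very end of the index: start a new page, no split.
            page.pop()
            pages.append([key])
            lows.append(key)
        else:
            # Insert into the middle of a full page: move half the rows out.
            mid = len(page) // 2
            right = page[mid:]
            del page[mid:]
            pages.insert(idx + 1, right)
            lows.insert(idx + 1, right[0])
            splits += 1
    return splits, total / (len(pages) * PAGE_CAPACITY)


N_ROWS = 20_000

# Ever-increasing keys, as an identity INT column would produce.
seq_splits, seq_fill = simulate_inserts(range(N_ROWS))

# Random 128-bit keys, standing in for non-sequential GUIDs.
rand_splits, rand_fill = simulate_inserts(uuid.uuid4().int for _ in range(N_ROWS))

print(f"sequential: splits={seq_splits}, average fill={seq_fill:.2f}")
print(f"random:     splits={rand_splits}, average fill={rand_fill:.2f}")
```

In this model the increasing keys never split a page and leave every page full, while the random keys split pages repeatedly and leave them partly empty, which is the fragmentation described above.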

There are many urban legends surrounding GUID usage that condemn them based on their size (16 bytes) compared to an int (4 bytes) and promise horrible performance doom if they are used. This is slightly exaggerated. A 16-byte key can still be a very performant key in a properly designed data model. While it is true that being 4 times as big as an int results in lower-density non-leaf pages in indexes, this is not a real concern for the vast majority of tables. The b-tree is a naturally well-balanced structure and the depth of tree traversal is seldom an issue, so seeking a value based on a GUID key as opposed to an INT key is similar in performance. A leaf-page traversal (i.e. a table scan) does not look at the non-leaf pages, and the impact of GUID size on leaf-page density is typically quite small, as the record itself is significantly larger than the extra 12 bytes introduced by the GUID.

So I'd take the hearsay advice based on '16 bytes vs. 4' with a rather large grain of salt. Analyze each case individually and decide if the size impact makes a real difference: how many other columns are in the table (i.e. how much impact the GUID size has on the leaf pages) and how many references are using it (i.e. how many other tables will grow because they need to store a larger foreign key).

I'm calling out all these details in a sort of makeshift defense of GUIDs because they have been getting a lot of bad press lately and some of it is undeserved. They have their merits and are indispensable in any distributed system (the moment you're talking about data movement, be it via replication, sync framework or whatever). I've seen bad decisions made on the basis of the GUID's bad reputation, when they were shunned without proper consideration. But it is true: if you have to use a GUID as a clustered key, make sure you address the randomness issue and use sequential GUIDs when possible.
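The size argument can be checked with back-of-the-envelope arithmetic. The Python sketch below is only an estimate under stated assumptions: 8096 usable bytes per page, an assumed 11 bytes of per-entry overhead in non-leaf pages, 196 bytes of other columns per row, and 100 million rows. The function name `index_levels` is made up for this example.

```python
import math

PAGE_BYTES = 8096      # assumed usable bytes per 8 KB page
ENTRY_OVERHEAD = 11    # assumed per-entry overhead in non-leaf pages


def index_levels(row_count, key_bytes, other_columns_bytes):
    """Estimate (leaf_pages, non_leaf_levels) for a clustered index."""
    rows_per_leaf = PAGE_BYTES // (key_bytes + other_columns_bytes)
    fan_out = PAGE_BYTES // (key_bytes + ENTRY_OVERHEAD)
    leaf_pages = math.ceil(row_count / rows_per_leaf)
    # Count how many levels are needed above the leaf level to reach one root.
    levels = 0
    pages = leaf_pages
    while pages > 1:
        pages = math.ceil(pages / fan_out)
        levels += 1
    return leaf_pages, levels


ROWS = 100_000_000
OTHER_COLUMNS = 196    # assumed bytes per row excluding the key

int_leaf, int_levels = index_levels(ROWS, 4, OTHER_COLUMNS)
guid_leaf, guid_levels = index_levels(ROWS, 16, OTHER_COLUMNS)

print(f"INT : {int_leaf:,} leaf pages, {int_levels} non-leaf levels")
print(f"GUID: {guid_leaf:,} leaf pages, {guid_levels} non-leaf levels")
```

Under these assumptions both keys need the same number of non-leaf levels, and the leaf level grows by roughly 5% with the GUID. Narrower rows or many referencing tables would shift the numbers, which is why each case needs its own analysis.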

And finally, to answer your question: if you don't have a specific reason to use GUIDs, use INTs.

196

answered Oct 04 '22 13:10

Remus Rusanu
