Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Advantages and disadvantages of GUID / UUID database keys

I've worked on a number of database systems in the past where moving entries between databases would have been made a lot easier if all the database keys had been GUID / UUID values. I've considered going down this path a few times, but there's always a bit of uncertainty, especially around performance and un-read-out-over-the-phone-able URLs.

Has anyone worked extensively with GUIDs in a database? What advantages would I get by going that way, and what are the likely pitfalls?

like image 652
Matt Sheppard Avatar asked Sep 05 '08 08:09

Matt Sheppard


People also ask

What is the advantage of UUID?

Advantages: UUID values are unique between tables and databases. Thats why it can be merge rows between two databases or distributed databases. UUID is more safer to pass through url than integer type data.

What is the difference between GUID and UUID?

The GUID designation is an industry standard defined by Microsoft to provide a reference number which is unique in any context. UUID is a term that stands for Universal Unique Identifier. Similarly, GUID stands for Globally Unique Identifier. So basically, two terms for the same thing.

Is it good to use UUID as primary key?

Pros. Using UUID for a primary key brings the following advantages: UUID values are unique across tables, databases, and even servers that allow you to merge rows from different databases or distribute databases across servers. UUID values do not expose the information about your data so they are safer to use in a URL.

Why UUID is bad for primary key?

Primary keys should never be exposed, even UUIDsA primary key is, by definition unique within its scope. It is, therefore, an obvious thing to use as a customer number, or in a URL to identify a unique page or row. Don't! I would argue that using a PK in any public context is a bad idea.


2 Answers

Advantages:

  • Can generate them offline.
  • Makes replication trivial (as opposed to int's, which makes it REALLY hard)
  • ORM's usually like them
  • Unique across applications. So We can use the PK's from our CMS (guid) in our app (also guid) and know we are NEVER going to get a clash.

Disadvantages:

  • Larger space use, but space is cheap(er)
  • Can't order by ID to get the insert order.
  • Can look ugly in a URL, but really, WTF are you doing putting a REAL DB key in a URL!? (This point disputed in comments below)
  • Harder to do manual debugging, but not that hard.

Personally, I use them for most PK's in any system of a decent size, but I got "trained" on a system which was replicated all over the place, so we HAD to have them. YMMV.

I think the duplicate data thing is rubbish - you can get duplicate data however you do it. Surrogate keys are usually frowned upon where ever I've been working. We DO use the WordPress-like system though:

  • unique ID for the row (GUID/whatever). Never visible to the user.
  • public ID is generated ONCE from some field (e.g. the title - make it the-title-of-the-article)

UPDATE: So this one gets +1'ed a lot, and I thought I should point out a big downside of GUID PK's: Clustered Indexes.

If you have a lot of records, and a clustered index on a GUID, your insert performance will SUCK, as you get inserts in random places in the list of items (thats the point), not at the end (which is quick)

So if you need insert performance, maybe use a auto-inc INT, and generate a GUID if you want to share it with someone else (ie, show it to a user in a URL)

like image 105
Nic Wise Avatar answered Oct 09 '22 08:10

Nic Wise


Why doesn't anyone mention performance? When you have multiple joins, all based on these nasty GUIDs the performance will go through the floor, been there :(

like image 24
Andrei Rînea Avatar answered Oct 09 '22 08:10

Andrei Rînea