
Using GUID as PK in big partitioned MySQL table

We have a huge InnoDB table with hundreds of millions of rows and only 3 columns: GUID, enum, smallint. All lookups are done by GUID.

We're considering making GUID the PK and partitioning it BY KEY.

We've heard that using a GUID as the PK is bad because GUIDs are randomly distributed and, in InnoDB, the PK is the clustered index. Storing rows in random GUID order therefore increases fragmentation and page splits.

The alternative to using the GUID as the PK is to create a surrogate auto-increment key and use that as the PK. However, if we want to partition the table by GUID, the GUID has to be part of the PK as well. Also, since all queries are done by GUID, we need an additional GUID index. That index essentially maps GUID -> PK, whereas if we use the GUID as the PK, the table itself maps GUID -> enum + smallint.
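To make the trade-off above concrete, here is a rough back-of-the-envelope sketch in Python. The column sizes are assumptions not stated in the question (GUID stored as BINARY(16), an ENUM with at most 255 values, a BIGINT surrogate key), and per-row InnoDB overhead is ignored. It relies on the fact that an InnoDB secondary index entry carries the indexed columns plus the primary key columns.

```python
# Assumed column sizes in bytes (illustrative, not taken from the question).
GUID_BYTES = 16      # assumes the GUID is stored as BINARY(16)
ENUM_BYTES = 1       # assumes an ENUM with at most 255 values
SMALLINT_BYTES = 2
BIGINT_BYTES = 8     # assumes a BIGINT auto-increment surrogate key


def guid_pk_bytes_per_row():
    """Design A: GUID is the PK, so the clustered index holds the whole row."""
    return GUID_BYTES + ENUM_BYTES + SMALLINT_BYTES


def surrogate_pk_bytes_per_row():
    """Design B: PK is (id, guid) plus a secondary index on guid."""
    clustered = BIGINT_BYTES + GUID_BYTES + ENUM_BYTES + SMALLINT_BYTES
    # A secondary index entry stores the indexed column plus the PK columns
    # that are not already in it: guid + id.
    secondary = GUID_BYTES + BIGINT_BYTES
    return clustered + secondary


def total_gigabytes(bytes_per_row, rows):
    """Raw column data only; ignores record headers and page fill factor."""
    return bytes_per_row * rows / 1e9


if __name__ == "__main__":
    rows = 300_000_000  # hypothetical stand-in for "hundreds of millions"
    for name, size in (("GUID as PK", guid_pk_bytes_per_row()),
                       ("surrogate PK + GUID index", surrogate_pk_bytes_per_row())):
        print(f"{name}: {size} B/row, ~{total_gigabytes(size, rows):.1f} GB raw")
```

Under these assumptions the surrogate design stores the GUID twice. Because the GUID index does not contain the enum and smallint, a lookup by GUID also has to descend the secondary index and then the clustered index, instead of a single descent when the GUID is the PK.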

So my question is: do we gain anything by adding an auto-increment PK and maintaining an additional GUID index?

Thanks, Philopator.

Asked by Philopator, Jun 22 '11 19:06




1 Answer

The problem with using GUIDs as the PK in InnoDB isn't just that GUIDs are randomly distributed. It's that InnoDB stores records in primary key order, because the PK is the clustered index. In the table design you describe, every insert lands at an effectively random position in that index, so InnoDB will constantly be splitting pages and moving rows around to keep the GUIDs sorted. Instead, use a translation table that maps each GUID to an INT or BIGINT, and use that integer as the PK.
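The effect of insert order on a clustered index can be illustrated with a toy model. The Python sketch below is not InnoDB's actual algorithm: the page capacity, the 50/50 split rule, and the function names are all assumptions made for illustration. It keeps fixed-size sorted pages, splits a full page in half when a key lands in the middle of it, and simply starts a new page when keys arrive in ascending order at the end of the index.

```python
import bisect
import random
import uuid

PAGE_CAPACITY = 100  # hypothetical number of rows that fit in one page


def simulate_inserts(keys, capacity=PAGE_CAPACITY):
    """Insert unique keys into a toy clustered index and report page stats."""
    pages = [[]]   # pages in key order; each page is a sorted list of keys
    bounds = []    # bounds[i] is the first key of pages[i + 1]
    splits = rows_moved = total = 0
    for key in keys:
        idx = bisect.bisect_right(bounds, key)  # page that covers this key
        page = pages[idx]
        bisect.insort(page, key)
        total += 1
        if len(page) <= capacity:
            continue
        if idx == len(pages) - 1 and page[-1] == key:
            # Ascending insert at the very end: start a fresh page,
            # leaving the full page untouched.
            new_page = [page.pop()]
        else:
            # Insert into the middle of a full page: split it in half.
            mid = len(page) // 2
            new_page = page[mid:]
            del page[mid:]
            splits += 1
            rows_moved += len(new_page)  # rows placed on the new page
        pages.insert(idx + 1, new_page)
        bounds.insert(idx, new_page[0])
    return {
        "pages": len(pages),
        "splits": splits,
        "rows_moved": rows_moved,
        "fill": total / (len(pages) * capacity),
    }


def random_guid_keys(n, seed=0):
    """Generate n random version-4 GUIDs as 16-byte keys (seeded for repeatability)."""
    rng = random.Random(seed)
    return [uuid.UUID(int=rng.getrandbits(128), version=4).bytes for _ in range(n)]


if __name__ == "__main__":
    n = 20_000
    print("auto-increment:", simulate_inserts(range(n)))
    print("random GUIDs:  ", simulate_inserts(random_guid_keys(n)))
```

In this model, ascending keys fill every page completely and never trigger a split, while random GUID keys cause repeated splits and leave pages partly empty. Real InnoDB behaviour also depends on buffer pool size, page size, and row format, so treat the numbers as qualitative only.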

Answered by Jeremiah Gowdy, Nov 02 '22 07:11