Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

GUIDs as Primary Keys - Offline OLTP

We are working on designing an application that is typically OLTP (think: purchasing system). However, this one in particular has the need that some users will be offline, so they need to be able to download the DB to their machine, work on it, and then sync back once they're on the LAN.

I would like to note that I know this has been done before, I just don't have experience with this particular model.

One idea I thought about was using GUIDs as table keys. So for example, a Purchase Order would not have a number (auto-numeric) but a GUID instead, so that every offline client can generate those, and I don't have clashes when I connect back to the DB.

Is this a bad idea for some reason? Will access to these tables through the GUID key be slow?

Have you had experience with these type of systems? How have you solved this problem?

Thanks!
Daniel

like image 933
Daniel Magliola Avatar asked Sep 02 '08 18:09

Daniel Magliola


People also ask

Is it safe to use GUID as primary key?

Having a guid column is perfectly ok like any varchar column as long as you do not use it as PK part and in general as a key column to join tables. Your database must have its own PK elements, filtering and joining data using them - filtering also by a GUID afterwards is perfectly ok.

Why you should never use GUIDs as part of Clustered Index?

The problem with clustered indexes in a GUID field are that the GUIDs are random, so when a new record is inserted, a significant portion of the data on disk has to be moved to insert the records into the middle of the table.

Are GUIDs sequential?

Sequential GUIDs are not actually sequential. In normal circumstances, GUIDs being generated by the same computer will have gradually increasing Timestamp fields (with the other fields remaining constant).

Which option is used to return the globally unique identifier or GUID column from a table?

Using NEWSEQUENTIALID also helps to completely fill the data and index pages. Its return type is Unique Identifier. The NEWSEQUENTIALID () can only be used with DEFAULT constraints on table columns of type UNIQUEIDENTIFIER.


2 Answers

Using Guids as primary keys is acceptable and is considered a fairly standard practice for the same reasons that you are considering them. They can be overused which can make things a bit tedious to debug and manage, so try to keep them out of code tables and other reference data if at all possible.

The thing that you have to concern yourself with is the human readable identifier. Guids cannot be exchanged by people - can you imagine trying to confirm your order number over the phone if it is a guid? So in an offline scenario you may still have to generate something - like a publisher (workstation/user) id and some sequence number, so the order number may be 123-5678 -.

However this may not satisfy business requirements of having a sequential number. In fact regulatory requirements can be and influence - some regulations (SOX maybe) require that invoice numbers are sequential. In such cases it may be neccessary to generate a sort of proforma number which is fixed up later when the systems synchronise. You may land up with tables having OrderId (Guid), OrderNo (int), ProformaOrderNo (varchar) - some complexity may creep in.

At least having guids as primary keys means that you don't have to do a whole lot of cascading updates when the sync does eventually happen - you simply update the human readable number.

like image 155
Simon Munro Avatar answered Sep 28 '22 17:09

Simon Munro


@SqlMenace

There are other problems with GUIDs, you see GUIDs are not sequential, so inserts will be scattered all over the place, this causes page splits and index fragmentation

Not true. Primary key != clustered index.

If the clustered index is another column ("inserted_on" springs to mind) then the inserts will be sequential and no page splits or excessive fragmentation will occur.

like image 22
Portman Avatar answered Sep 28 '22 17:09

Portman