Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ID Generation for Sharded Database (Azure Federated Database)

I have been looking for some articles or guidence on best practice for id generation (for the federated/primary key) for Azure Federated databases and haven't found anything compelling. Federated tables don't support identity columns, so it seems to me that the only practical type of id is a GUID, as trying to centrally create and use a BigInt creates a single point of failure in the app. My chief concern is the performance implications of using GUIDs over BigInts (particularly for indexing the tables).

Is there any recommended/best practice (or existing libraries) to create unique BigInts for a distributed system (or should I not worry about the performance implications of using GUIDs?).

[Update]

Having read a lot more about this since posting the question, it seems to me that key generation will be a problem in Azure. As per the this blog post from Microsoft, it is recommended to use GUIDs as Federated key. However they fail to mention that all indexes (including clustered indexes) on Federated tables have to contain the federated key. This means that all these indexes will contain a GUID, which is going to kill insert performance.

The alternative seems to be using a centeralized key generation service (as mentioned by Simon below) which has its own drawbacks in terms of being a potential bottle neck and central point of failure.

I would have thought there would have been more guidence on this from Microsoft as surely this is an issue that everyone creating federated tables will face!

On balance I have decided to go with a centeralized key generation service, but it does concern me a little bit. If anyone has some magic technique I would love to hear it (or let me know if I am missing something obvious) !

like image 504
Mike Hanrahan Avatar asked Feb 16 '12 22:02

Mike Hanrahan


1 Answers

You could create sequences in the application using a variety of techniques, but they are not straightforward because of the distributed nature. One that is quite good is using blob storage and preconditions.

Depending on your project schedule you may want to use the SQL 2012 SEQUENCE and put all your sequences in a small non-federated database. SEQUENCE is not available yet on SQL Azure.

like image 158
Simon Munro Avatar answered Oct 07 '22 18:10

Simon Munro