Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sequential GUIDs

Tags:

c#

guid

windows

dll

I hope someone can answer this question.

How does the UuidCreateSequential method in the rpcrt4.dll class use to seed it's guids?

I know this much: Microsoft changed the UuidCreate function so it no longer uses the machine's MAC address as part of the UUID. Since CoCreateGuid calls UuidCreate to get its GUID, its output also changed. If you still like the GUIDs to be generated in sequential order (helpful for keeping a related group of GUIDs together in the system registry), you can use the UuidCreateSequential function.

The reason behind the question is. If I use this function to generate sequential GUIDs in a web cluster, how can I ensure that the GUIDs are close to a range of GUIDs without the potential of the GUID being duplicated?

like image 929
Sean Avatar asked Apr 07 '11 17:04

Sean


People also ask

Are GUIDs sequential?

Sequential GUIDs are not actually sequential.

How does NEWSEQUENTIALID work?

The NEWSEQUENTIALID() function creates a GUID that is greater than any GUID previously generated by this function on a specified computer since Windows was started. After restarting Windows, the GUID can start again from a lower range. Using the NEWSEQUENTIALID() function can be faster than using the NEWID() function.

What is comb GUID?

The COMB GUID (with embed date and time) becomes a sequential GUID, with each GUID being sequentially after the previous GUID. This works great for indexing and sorting. But you cant just replace or insert this time stamp anywhere in the GUID, any part of the GUID with the date/time, it depends on the Database Server.


1 Answers

The Win32 UuidCreateSequential creates a Version 1 uuid.

Here's some sample version 1 uuid's created on my computer using UuidCreateSequential:

GuidToString                            Raw bytes
======================================  =================================================
{1BE8D85D-63D1-11E1-80DB-B8AC6FBE26E1}  1B E8 D8 5D 63 D1 11 E1  80 DB  B8 AC 6F BE 26 E1
{1BE8D85E-63D1-11E1-80DB-B8AC6FBE26E1}  1B E8 D8 5E 63 D1 11 E1  80 DB  B8 AC 6F BE 26 E1
{1BE8D85F-63D1-11E1-80DB-B8AC6FBE26E1}  1B E8 D8 5F 63 D1 11 E1  80 DB  B8 AC 6F BE 26 E1
{1BE8D860-63D1-11E1-80DB-B8AC6FBE26E1}  1B E8 D8 60 63 D1 11 E1  80 DB  B8 AC 6F BE 26 E1
{1BE8D861-63D1-11E1-80DB-B8AC6FBE26E1}  1B E8 D8 61 63 D1 11 E1  80 DB  B8 AC 6F BE 26 E1
{1BE8D862-63D1-11E1-80DB-B8AC6FBE26E1}  1B E8 D8 62 63 D1 11 E1  80 DB  B8 AC 6F BE 26 E1
{1BE8D863-63D1-11E1-80DB-B8AC6FBE26E1}  1B E8 D8 63 63 D1 11 E1  80 DB  B8 AC 6F BE 26 E1
{1BE8D864-63D1-11E1-80DB-B8AC6FBE26E1}  1B E8 D8 64 63 D1 11 E1  80 DB  B8 AC 6F BE 26 E1
{1BE8D865-63D1-11E1-80DB-B8AC6FBE26E1}  1B E8 D8 65 63 D1 11 E1  80 DB  B8 AC 6F BE 26 E1
{220FB46C-63D1-11E1-80DB-B8AC6FBE26E1}  22 0F B4 6C 63 D1 11 E1  80 DB  B8 AC 6F BE 26 E1

The first thing that's important to note that these uuid contain my machine's MAC address (B8AC6FBE26E1):

enter image description here

                        Node
======================= ============
1BE8D85D-63D1-11E1-80DB B8AC6FBE26E1
1BE8D85E-63D1-11E1-80DB B8AC6FBE26E1
1BE8D85F-63D1-11E1-80DB B8AC6FBE26E1
1BE8D860-63D1-11E1-80DB B8AC6FBE26E1
1BE8D861-63D1-11E1-80DB B8AC6FBE26E1
1BE8D862-63D1-11E1-80DB B8AC6FBE26E1
1BE8D863-63D1-11E1-80DB B8AC6FBE26E1
1BE8D864-63D1-11E1-80DB B8AC6FBE26E1
1BE8D865-63D1-11E1-80DB B8AC6FBE26E1
220FB46C-63D1-11E1-80DB B8AC6FBE26E1

So if you're hoping for different computers to generate guid's that are "close" to each other, you're going to be disappointed.

Let's look at the rest of the values.

Seven and a half bytes of the remaining 10 bytes are a timestamp; the number of 100ns intervals since 00:00:00 15 October 1582. Rearranging those timestamp bytes together:

Timestamp              Node
=============== ====== ============
1E163D11BE8D85D 1-80DB B8AC6FBE26E1
1E163D11BE8D85E 1-80DB B8AC6FBE26E1
1E163D11BE8D85F 1-80DB B8AC6FBE26E1
1E163D11BE8D860 1-80DB B8AC6FBE26E1
1E163D11BE8D861 1-80DB B8AC6FBE26E1
1E163D11BE8D862 1-80DB B8AC6FBE26E1
1E163D11BE8D863 1-80DB B8AC6FBE26E1
1E163D11BE8D864 1-80DB B8AC6FBE26E1
1E163D11BE8D865 1-80DB B8AC6FBE26E1
1E163D1220FB46C 1-80DB B8AC6FBE26E1

You can see that guid's created on the same machine by UuidCreateSequential will be together, as they are chronological.


The 1 you see is the version number, in this case meaning a time based uuid. There are 5 defined versions:

  • 1: time based version (UuidCreateSequential)
  • 2: DCE Security version, with embedded POSIX UIDs
  • 3: Name-based version that uses MD5 hashing
  • 4: Randomly or pseudo-randomly generated version (UuidCreate)
  • 5: Name-based version that uses SHA-1 hashing
  • 6: The version RFC 4122 forgot (unofficial)

Giving:

Timestamp       Version      Node
=============== ======= ==== ============
1E163D11BE8D85D 1       80DB B8AC6FBE26E1
1E163D11BE8D85E 1       80DB B8AC6FBE26E1
1E163D11BE8D85F 1       80DB B8AC6FBE26E1
1E163D11BE8D860 1       80DB B8AC6FBE26E1
1E163D11BE8D861 1       80DB B8AC6FBE26E1
1E163D11BE8D862 1       80DB B8AC6FBE26E1
1E163D11BE8D863 1       80DB B8AC6FBE26E1
1E163D11BE8D864 1       80DB B8AC6FBE26E1
1E163D11BE8D865 1       80DB B8AC6FBE26E1
1E163D1220FB46C 1       80DB B8AC6FBE26E1

The last word contains two things.

The lower 12 bits is the machine-specifc Clock Sequence number:

Timestamp       Version   Clock Sequence   Node
=============== ======= = ================ ============
1E163D11BE8D85D 1       8 0DB              B8AC6FBE26E1
1E163D11BE8D85E 1       8 0DB              B8AC6FBE26E1
1E163D11BE8D85F 1       8 0DB              B8AC6FBE26E1
1E163D11BE8D860 1       8 0DB              B8AC6FBE26E1
1E163D11BE8D861 1       8 0DB              B8AC6FBE26E1
1E163D11BE8D862 1       8 0DB              B8AC6FBE26E1
1E163D11BE8D863 1       8 0DB              B8AC6FBE26E1
1E163D11BE8D864 1       8 0DB              B8AC6FBE26E1
1E163D11BE8D865 1       8 0DB              B8AC6FBE26E1
1E163D1220FB46C 1       8 0DB              B8AC6FBE26E1

This machine-wide persistent value is incremented if:

  • you switched network cards
  • you generated a UUID less than 100 ns from the last one (and the timestamp would collide)

So, again, any guid's created by UuidCreateSequential will (ideally) have the same Clock Sequence number, making them "near" to each other.

The final 2 bits, is called a Variant, and is always set to binary 10:

Timestamp       Version Variant Clock Sequence   Node
=============== ======= ======= ================ ============
1E163D11BE8D85D 1       8       0DB              B8AC6FBE26E1
1E163D11BE8D85E 1       8       0DB              B8AC6FBE26E1
1E163D11BE8D85F 1       8       0DB              B8AC6FBE26E1
1E163D11BE8D860 1       8       0DB              B8AC6FBE26E1
1E163D11BE8D861 1       8       0DB              B8AC6FBE26E1
1E163D11BE8D862 1       8       0DB              B8AC6FBE26E1
1E163D11BE8D863 1       8       0DB              B8AC6FBE26E1
1E163D11BE8D864 1       8       0DB              B8AC6FBE26E1
1E163D11BE8D865 1       8       0DB              B8AC6FBE26E1
1E163D1220FB46C 1       8       0DB              B8AC6FBE26E1

So there you have it. Sequential guid's are sequential; and if you create them on the same machine they will be "near" to each other in a database.


But you want to know what actually happens with two sequential UUID's created on different computers.

Using our newfound knowledge of Version 1 guids, let's construct two guid's for the same timestamp from different machines, e.g.:

{1BE8D85D-63D1-11E1-80DB-B8AC6FBE26E1}
{1BE8D85D-63D1-11E1-80DB-123456789ABC}

First let's insert a bunch of guid's with sequential timestamps. First create a temporary table to store our guid's in, and cluster by the guid:

--DROP table #uuidOrderingTest
CREATE TABLE #uuidOrderingTest
( 
    uuid uniqueidentifier not null
)

CREATE clustered index IX_uuidorderingTest_uuid ON #uuidOrderingTest 
( 
   uuid
)

Now insert the data:

INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D866-63D1-11E1-80DB-B8AC6FBE26E1}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D862-63D1-11E1-80DB-B8AC6FBE26E1}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D861-63D1-11E1-80DB-B8AC6FBE26E1}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D85E-63D1-11E1-80DB-B8AC6FBE26E1}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D864-63D1-11E1-80DB-B8AC6FBE26E1}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D863-63D1-11E1-80DB-B8AC6FBE26E1}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D85F-63D1-11E1-80DB-B8AC6FBE26E1}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D85D-63D1-11E1-80DB-B8AC6FBE26E1}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D865-63D1-11E1-80DB-B8AC6FBE26E1}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D860-63D1-11E1-80DB-B8AC6FBE26E1}')

Note: i insert them in random timestamp order, to illustrate that SQL Server will cluster them.

Get the rows back and see what order they're in sequential (timestamp) order:

SELECT * FROM #uuidOrderingTest

uuid
------------------------------------
1BE8D85D-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D85E-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D85F-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D860-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D861-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D862-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D863-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D864-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D865-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D866-63D1-11E1-80DB-B8AC6FBE26E1

Now lets insert guid's with:

  • the same timestamps
  • but different node (i.e. MAC address):

Insert the new guids from a "different" computer:

INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D866-63D1-11E1-80DB-123456789ABC}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D862-63D1-11E1-80DB-123456789ABC}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D861-63D1-11E1-80DB-123456789ABC}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D85E-63D1-11E1-80DB-123456789ABC}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D864-63D1-11E1-80DB-123456789ABC}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D863-63D1-11E1-80DB-123456789ABC}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D85F-63D1-11E1-80DB-123456789ABC}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D85D-63D1-11E1-80DB-123456789ABC}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D865-63D1-11E1-80DB-123456789ABC}')
INSERT INTO #uuidOrderingTest (uuid) VALUES ('{1BE8D860-63D1-11E1-80DB-123456789ABC}')

And get the results:

uuid
------------------------------------
1BE8D85D-63D1-11E1-80DB-123456789ABC
1BE8D85E-63D1-11E1-80DB-123456789ABC
1BE8D85F-63D1-11E1-80DB-123456789ABC
1BE8D860-63D1-11E1-80DB-123456789ABC
1BE8D861-63D1-11E1-80DB-123456789ABC
1BE8D862-63D1-11E1-80DB-123456789ABC
1BE8D863-63D1-11E1-80DB-123456789ABC
1BE8D864-63D1-11E1-80DB-123456789ABC
1BE8D865-63D1-11E1-80DB-123456789ABC
1BE8D866-63D1-11E1-80DB-123456789ABC
1BE8D85D-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D85E-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D85F-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D860-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D861-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D862-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D863-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D864-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D865-63D1-11E1-80DB-B8AC6FBE26E1
1BE8D866-63D1-11E1-80DB-B8AC6FBE26E1

So there you have it. SQL Server order's Node before Timestamp. Uuid created from different machines will not be clustered together. Would have been better if it hadn't done so, but whatcha gonna do.

like image 82
Ian Boyd Avatar answered Oct 12 '22 09:10

Ian Boyd