Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Server : why is the 15th char of a GUID always 4?

I'm just curious, I've always wondered why this is so.

In an attempt to find out if I could create one without the character 4 at the 15th character, I ran this...

DECLARE @GUID AS NVARCHAR(36)
DECLARE @COUNT AS INTEGER

SET @COUNT = 0
SET @GUID = CAST(NEWID() AS NVARCHAR(36))

WHILE SUBSTRING(@GUID,15,1) = '4'
BEGIN
    SET @COUNT = @COUNT + 1
    SET @GUID = CAST(NEWID() AS NVARCHAR(36))
END

PRINT 'Attempts : ' + CAST(@COUNT AS NVARCHAR(MAX))
PRINT @GUID

As you might guess, this never actually ended for me. I had this running on a server all weekend.

If NewID is supposed to always give a random ID, why is that 4 always there.

BC13DF1C-60FB-41C2-B5B2-8F1A73CF2485
D790D359-AB3D-4657-A864-FA89FACB3E99
DF1BBC0C-4205-48E8-A1B6-EA9544D7C6E5

Is the 15th position some kind of identify as to the system that generated the uniqueidentifier?

In fact, the same thing happens with VB.net's System.Guid.Newguid function. Is the 4 a Microsoft only thing?

Edit: Perhaps I should have also asked, are they actually unique? Can one rely one them being unique in an entire database? I know database systems based on the assumption these are guaranteed to be unique within the database. With several millions records in different tables... are any of them potentially the same?

like image 497
Elarys Avatar asked Feb 17 '12 17:02

Elarys


2 Answers

The 4 indicates that it was generated using a pseudo-random number; See Wikipedia's article for Globally Unique Identifiers under Algorithm.

like image 70
Chris Shaffer Avatar answered Sep 29 '22 13:09

Chris Shaffer


It's related to the UUID/GUID version and how it's put together. Full details on Wikipedia, summary:

In the canonical representation, xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx, the most significant bits of N indicates the variant (depending on the variant; one, two or three bits are used). The variant covered by the UUID specification is indicated by the two most significant bits of N being 1 0 (i.e. the hexadecimal N will always be 8, 9, a, or b).

In the variant covered by the UUID specification, there are five versions. For this variant, the four bits of M indicates the UUID version (i.e. the hexadecimal M will either be 1, 2, 3, 4, or 5).

...

Version 4 (random)

Version 4 UUIDs use a scheme relying only on random numbers. This algorithm sets the version number as well as two reserved bits. All other bits are set using a random or pseudorandom data source.

Version 4 UUIDs have the form xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx where x is any hexadecimal digit and y is one of 8, 9, A, or B. e.g. f47ac10b-58cc-4372-a567-0e02b2c3d479.

Essentially, that digit is the version of the UUID, which explains how it was created. 4 indicates random, so the implication is that MSSQL uses random generation (vs MAC-address based, for example).

I believe, although not sure, that most MS tools and possibly the WinAPI GUID-creation functions all create version-4 UUIDs. Glancing at a scattering of COM GUIDs, this appears to be the case.

like image 20
ssube Avatar answered Sep 29 '22 13:09

ssube