One hard-working day I noticed that the GUIDs I had been generating with .NET's usual Guid.NewGuid() method all had the same digit 4 at the beginning of the third block:
efeafa5f-fe21-4ab4-ba82-b9eefd5fa225
480b64d0-6762-4afe-8496-ac7cf3292898
397579c2-a4f4-4611-9fda-16e9c1e52d6a
...
There were ten of them, appearing on the screen once a second or so. I kept my eye on this pattern from the fifth GUID onward. Finally, the last one had the same four bits inside, and I decided that I was a lucky guy. I went home feeling that the whole world was open to such an exceptional person as me. The next week I found a new job, cleaned my room and called my parents.
But today I faced the same pattern again. A thousand times. And I don't feel like the Chosen One anymore.
I googled it, and now I know about UUIDs and the canonical format with 4 bits reserved for the version and 2 for the variant.
Here's a snippet to experiment with:
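using System;

class Program
{
    static void Main(string[] args)
    {
        while (true)
        {
            var g = Guid.NewGuid();
            // ToByteArray() stores the first three fields little-endian,
            // so the byte order differs from the ToString() output.
            Console.WriteLine(BitConverter.ToString(g.ToByteArray()));
            Console.WriteLine(g.ToString());
            Console.ReadLine(); // press Enter for the next GUID
        }
    }
}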
But there is still one thing I don't understand (except how to go on living). Why do we need these reserved bits? I see how they can do harm - exposing internal implementation details, more collisions (still nothing to worry about, but one day...), more suicides - but I don't see any benefit. Can you help me find one?

It is there so that the number can change whenever the generation algorithm changes. Otherwise two different algorithms could produce exactly the same UUID from different inputs, leading to a collision. It is a version identifier.
For example, consider a contrived simplistic UUID format:
00000000-00000000
time - ip
Now suppose we change that format for some reason to:
00000000-00000000
ip - time
This could generate a collision when a machine with IP 12.34.56.78 generates a UUID using the first method at time 01234567, and later a second machine with IP 01.23.45.67 generates a UUID at time 12345678 using the newer method. But if we reserve some bits for a version identifier, this cannot possibly cause a collision.
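The contrived example above can be sketched in a few lines of C#. The class name ToyUuid and the helpers FormatA and FormatB are hypothetical, invented only for this illustration, and the single-digit version prefix stands in for the reserved version bits; none of this is a real UUID layout.

```csharp
using System;

class ToyUuid
{
    // Hypothetical old format: time first, then IP (8 digits each).
    static string FormatA(string time, string ip) => time + "-" + ip;

    // Hypothetical new format: IP first, then time.
    static string FormatB(string time, string ip) => ip + "-" + time;

    static void Main()
    {
        // Machine 1: IP 12.34.56.78 at time 01234567, old format.
        string a = FormatA("01234567", "12345678");
        // Machine 2: IP 01.23.45.67 at time 12345678, new format.
        string b = FormatB("12345678", "01234567");

        // Without a version marker the two identifiers are identical.
        Console.WriteLine(a == b); // True

        // Prefixing each with its format's version digit keeps them apart.
        Console.WriteLine(("1" + a) == ("2" + b)); // False
    }
}
```

Because identifiers produced by different formats always differ in the version digit, each format only has to guarantee uniqueness among its own output.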
The value 4 specifically refers to a randomly generated UUID (it therefore relies on the minuscule chance of collisions given so many bits), as opposed to other methods, which can combine the time, MAC address, PID, or other time and space identifiers to guarantee uniqueness.
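As a rough sketch, both reserved fields can be read straight from the string form of a GUID. The helper names Version and VariantTopBits are made up for this example; the character positions assume the standard 8-4-4-4-12 layout produced by ToString("D").

```csharp
using System;

class VersionBits
{
    // Version: first hex digit of the third group (index 14 in "D" format).
    static int Version(Guid g) =>
        Convert.ToInt32(g.ToString("D").Substring(14, 1), 16);

    // Variant: top two bits of the first hex digit of the fourth group (index 19).
    static int VariantTopBits(Guid g) =>
        Convert.ToInt32(g.ToString("D").Substring(19, 1), 16) >> 2;

    static void Main()
    {
        // First GUID quoted in the question.
        var g = Guid.Parse("efeafa5f-fe21-4ab4-ba82-b9eefd5fa225");
        Console.WriteLine(Version(g));        // 4
        Console.WriteLine(VariantTopBits(g)); // 2, i.e. binary 10
    }
}
```

With the top two variant bits fixed at binary 10, the first digit of the fourth group can only be 8, 9, a or b, which matches the three sample GUIDs in the question (b, 8, 9).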
See here for the relevant spec: https://www.rfc-editor.org/rfc/rfc4122#section-4.1.3