"GUIDs by their very nature are non Deterministic" - this is only true of certain types ('versions') of GUIDs. However I agree that "dressing an md5 hash up as a GUID does not make a good GUID" for other reasons as spelt out by @Bradley Grainger and @Rob Fonseca-Ensor, and my answer to this question.
To Generate a GUID in Windows 10 with PowerShell, Type or copy-paste the following command: [guid]::NewGuid() . This will produce a new GUID in the output. Alternatively, you can run the command '{'+[guid]::NewGuid(). ToString()+'}' to get a new GUID in the traditional Registry format.
Version-1 UUIDs are generated from a time and a node ID (usually the MAC address); version-2 UUIDs are generated from an identifier (usually a group or user ID), time, and a node ID; versions 3 and 5 produce deterministic UUIDs generated by hashing a namespace identifier and name; and version-4 UUIDs are generated ...
It's possible to generate an identical guid over and over. However, the chances of it happening are so low that you can assume they are unique.
As mentioned by @bacar, RFC 4122 §4.3 defines a way to create a name-based UUID. The advantage of doing this (over just using a MD5 hash) is that these are guaranteed not to collide with non-named-based UUIDs, and have a very (very) small possibility of collision with other name-based UUIDs.
There's no native support in the .NET Framework for creating these, but I posted code on GitHub that implements the algorithm. It can be used as follows:
Guid guid = GuidUtility.Create(GuidUtility.UrlNamespace, filePath);
To reduce the risk of collisions with other GUIDs even further, you could create a private GUID to use as the namespace ID (instead of using the URL namespace ID defined in the RFC).
This will convert any string into a Guid without having to import an outside assembly.
public static Guid ToGuid(string src)
{
byte[] stringbytes = Encoding.UTF8.GetBytes(src);
byte[] hashedBytes = new System.Security.Cryptography
.SHA1CryptoServiceProvider()
.ComputeHash(stringbytes);
Array.Resize(ref hashedBytes, 16);
return new Guid(hashedBytes);
}
There are much better ways to generate a unique Guid but this is a way to consistently upgrading a string data key to a Guid data key.
As Rob mentions, your method doesn't generate a UUID, it generates a hash that looks like a UUID.
The RFC 4122 on UUIDs specifically allows for deterministic (name-based) UUIDs - Versions 3 and 5 use md5 and SHA1(respectively). Most people are probably familiar with version 4, which is random. Wikipedia gives a good overview of the versions. (Note that the use of the word 'version' here seems to describe a 'type' of UUID - version 5 doesn't supercede version 4).
There seem to be a few libraries out there for generating version 3/5 UUIDs, including the python uuid module, boost.uuid (C++) and OSSP UUID. (I haven't looked for any .net ones)
You need to make a distinction between instances of the class Guid
, and identifiers that are globally unique. A "deterministic guid" is actually a hash (as evidenced by your call to provider.ComputeHash
). Hashes have a much higher chance of collisions (two different strings happening to produce the same hash) than Guid created via Guid.NewGuid
.
So the problem with your approach is that you will have to be ok with the possibility that two different paths will produce the same GUID. If you need an identifier that's unique for any given path string, then the easiest thing to do is just use the string. If you need the string to be obscured from your users, encrypt it - you can use ROT13 or something more powerful...
Attempting to shoehorn something that isn't a pure GUID into the GUID datatype could lead to maintenance problems in future...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With