Is creating a Guid out of an MD5 hash instead of String valid?

Tags: c#, .net, hash, md5

I am trying to implement a method for detecting duplicate files. I have an MD5 hashing method (let's ignore the fact that MD5 is broken) as below:

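using System.IO;
using System.Linq;
using System.Security.Cryptography;

using (MD5 hasher = MD5.Create())
using (FileStream fs = File.OpenRead("SomeFile"))
{
    // 16-byte MD5 digest of the file contents
    byte[] hashBytes = hasher.ComputeHash(fs);
    // Uppercase hex representation of the digest
    string hashString = string.Join(string.Empty, hashBytes.Select(x => x.ToString("X2")));
}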

Instead of creating a string out of hashBytes, can I simply create a Guid from it, like so?

Guid hashGuid = new Guid(hashBytes);

Would it still be valid or will I lose uniqueness?

MaYaN asked Apr 05 '19 10:04


1 Answer

An MD5 hash and a Guid are both essentially 128 bits of binary data, so:

  • plus: you won't lose any uniqueness
  • plus: the fact that Guid is a value-type means that you avoid an allocation compared to string...
  • minus: ... but if you're going to display it anywhere, you might actually end up allocating multiple strings (i.e. rendering the same Guid multiple times)
  • minus: there is a semantic meaning to Guid that won't really be respected/expected here
  • minus: Guid default formatting isn't the same as how MD5 hashes are usually expressed
  • minus: Guid endianness is a mess, so if you want to get between raw bytes and any text representation: tread very carefully; it is not what you expect
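
To make the uniqueness and endianness points concrete, here is a minimal sketch. It assumes a stand-in 16-byte array with the values 0 through 15 rather than a real MD5 digest, and the class name GuidHashDemo and helper ToHex are hypothetical names introduced only for illustration. It shows that the bytes survive a round trip through Guid unchanged, while the Guid's text form reorders the first eight bytes compared with a plain hex dump.

```csharp
using System;
using System.Linq;

public static class GuidHashDemo
{
    // Hex of the raw bytes in array order: the usual way an MD5 digest is written.
    public static string ToHex(byte[] bytes) =>
        string.Concat(bytes.Select(b => b.ToString("x2")));

    public static void Main()
    {
        // Stand-in for a 16-byte MD5 digest (illustrative values, not a real hash).
        byte[] hashBytes = Enumerable.Range(0, 16).Select(i => (byte)i).ToArray();

        Guid hashGuid = new Guid(hashBytes);

        // Plain hex dump of the bytes.
        Console.WriteLine(ToHex(hashBytes));       // 000102030405060708090a0b0c0d0e0f

        // Guid text form: the first three fields are read as little-endian integers,
        // so their bytes appear reversed relative to the hex dump.
        Console.WriteLine(hashGuid.ToString("N")); // 030201000504070608090a0b0c0d0e0f

        // The round trip through bytes is lossless, so no uniqueness is lost.
        Console.WriteLine(hashGuid.ToByteArray().SequenceEqual(hashBytes)); // True
    }
}
```

If the hash ever needs to be displayed or compared against an externally produced MD5 string, formatting hashGuid.ToByteArray() with a helper like ToHex keeps the conventional byte order, whereas Guid.ToString does not.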
Marc Gravell answered Oct 26 '22 23:10