
Generate Unique Hash code based on String


I have the following two strings:

    var string1 = "MHH2016-05-20MASTECH HOLDINGS, INC. Financialshttp://finance.yahoo.com/q/is?s=mhhEDGAR Online FinancialsHeadlines";
    var string2 = "CVEO2016-06-22Civeo upgraded by Scotia Howard Weilhttp://finance.yahoo.com/q/ud?s=CVEOBriefing.comHeadlines";

These two strings are clearly different; however, when I sum the GetHashCode values of their characters, both strings give the same total.

    var hash = 0;
    var total = 0;
    foreach (var x in string1) //string2
    {
        //hash = x * 7;
        hash = x.GetHashCode();
        Console.WriteLine("Char: " + x + " hash: " + hash + " hashed: " + (int) x);
        total += hash;
    }

Total ends up being 620438779 for both strings. Is there another method that will return a more unique hash code? I need the hash code to be unique based on the characters in the string. Although the two strings are different and the code runs without errors, their per-character hash codes happen to add up to the same total. How can I improve this code to make such collisions less likely?

asked Jun 26 '16 22:06 by some random dude




2 Answers

string.GetHashCode is indeed inappropriate for real hashing:

Warning

A hash code is intended for efficient insertion and lookup in collections that are based on a hash table. A hash code is not a permanent value. For this reason:

  • Do not serialize hash code values or store them in databases.
  • Do not use the hash code as the key to retrieve an object from a keyed collection.
  • Do not use the hash code instead of a value returned by a cryptographic hashing function. For cryptographic hashes, use a class derived from the System.Security.Cryptography.HashAlgorithm or System.Security.Cryptography.KeyedHashAlgorithm class.
  • Do not test for equality of hash codes to determine whether two objects are equal. (Unequal objects can have identical hash codes.) To test for equality, call the ReferenceEquals or Equals method.

It also has a high probability of producing duplicates.

Consider HashAlgorithm.ComputeHash. The sample is slightly changed to use SHA256 instead of MD5, as @zaph suggested:

    using System.Security.Cryptography;
    using System.Text;

    static string GetSha256Hash(SHA256 shaHash, string input)
    {
        // Convert the input string to a byte array and compute the hash.
        byte[] data = shaHash.ComputeHash(Encoding.UTF8.GetBytes(input));

        // Create a new StringBuilder to collect the bytes
        // and create a string.
        StringBuilder sBuilder = new StringBuilder();

        // Loop through each byte of the hashed data
        // and format each one as a hexadecimal string.
        for (int i = 0; i < data.Length; i++)
        {
            sBuilder.Append(data[i].ToString("x2"));
        }

        // Return the hexadecimal string.
        return sBuilder.ToString();
    }
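As a rough usage sketch, the two strings from the question can be run through SHA-256 and compared. The class name `Sha256Demo` and the helper `Sha256Hex` are made up for this illustration (the helper just repeats the hex-formatting idea so the snippet compiles on its own); only `SHA256.Create`, `ComputeHash` and `Encoding.UTF8.GetBytes` are framework calls.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

static class Sha256Demo
{
    // Hypothetical helper: returns the SHA-256 digest of input as lowercase hex.
    public static string Sha256Hex(string input)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] data = sha.ComputeHash(Encoding.UTF8.GetBytes(input));
            var sb = new StringBuilder(data.Length * 2);
            foreach (byte b in data)
            {
                sb.Append(b.ToString("x2"));
            }
            return sb.ToString();
        }
    }

    static void Main()
    {
        // The two strings from the question.
        string string1 = "MHH2016-05-20MASTECH HOLDINGS, INC. Financialshttp://finance.yahoo.com/q/is?s=mhhEDGAR Online FinancialsHeadlines";
        string string2 = "CVEO2016-06-22Civeo upgraded by Scotia Howard Weilhttp://finance.yahoo.com/q/ud?s=CVEOBriefing.comHeadlines";

        string h1 = Sha256Hex(string1);
        string h2 = Sha256Hex(string2);

        Console.WriteLine(h1);       // 64 hex characters
        Console.WriteLine(h2);       // 64 hex characters
        Console.WriteLine(h1 == h2); // False for these two inputs
    }
}
```

Unlike the summed per-character codes, the digest depends on the order of the bytes as well as their values, so inputs that merely add up to the same total no longer collide. The value is also the same on every run and machine, so it is safe to store.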
answered Oct 05 '22 12:10 by AlexD



    using System.Security.Cryptography;
    using System.Text;

    string data = "test";
    byte[] hash;
    using (MD5 md5 = MD5.Create())
    {
        // ComputeHash returns the digest directly.
        hash = md5.ComputeHash(Encoding.UTF8.GetBytes(data));
    }

hash is a 16-byte array, which you can then convert to a hex string or a Base64-encoded string for storage.
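For the conversion step, one possible sketch is shown here. `Md5Demo`, `ComputeMd5`, `ToHex` and `ToBase64` are names invented for this example; the framework calls are `BitConverter.ToString` and `Convert.ToBase64String`.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

static class Md5Demo
{
    // Hypothetical helper: MD5 digest of the UTF-8 bytes of data (16 bytes).
    public static byte[] ComputeMd5(string data)
    {
        using (MD5 md5 = MD5.Create())
        {
            return md5.ComputeHash(Encoding.UTF8.GetBytes(data));
        }
    }

    // BitConverter.ToString yields "09-8F-6B-..."; strip the dashes and lowercase it.
    public static string ToHex(byte[] hash)
    {
        return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
    }

    // Base64 is shorter than hex: 24 characters for a 16-byte digest.
    public static string ToBase64(byte[] hash)
    {
        return Convert.ToBase64String(hash);
    }

    static void Main()
    {
        byte[] hash = ComputeMd5("test");
        Console.WriteLine(ToHex(hash));    // 098f6bcd4621d373cade4e832627b4f6
        Console.WriteLine(ToBase64(hash)); // CY9rzUYh03PK3k6DJie09g==
    }
}
```

Hex is easier to read and compare by eye; Base64 takes less space if the value goes into a database column.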

EDIT:

What's the purpose of that hash code?

From hash(x) != hash(y) you can derive x!=y, but

from hash(x) == hash(y) you canNOT derive x==y in general!
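A small sketch of that second point, using the same summing approach as the question. `SumCollisionDemo` and `SumOfCharHashes` are hypothetical names for this illustration. Because addition does not care about order, any two strings made of the same characters in a different order get the same total:

```csharp
using System;

static class SumCollisionDemo
{
    // Hypothetical helper mirroring the loop in the question:
    // adds up the GetHashCode value of every character.
    public static int SumOfCharHashes(string s)
    {
        int total = 0;
        foreach (char c in s)
        {
            total += c.GetHashCode();
        }
        return total;
    }

    static void Main()
    {
        string x = "ab";
        string y = "ba";

        // Same characters, different order: the totals are equal...
        Console.WriteLine(SumOfCharHashes(x) == SumOfCharHashes(y)); // True

        // ...but the strings themselves are not.
        Console.WriteLine(x == y); // False
    }
}
```

So an equal total only says the strings might be equal; the strings still have to be compared directly to be sure.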

answered Oct 05 '22 14:10 by lexx9999
