Possible Duplicate:
What is the best algorithm for an overridden System.Object.GetHashCode?
I need to override the GetHashCode method for a type that consists of three strings. Here is my code:
public override int GetHashCode()
{
    return str1.GetHashCode() + str2.GetHashCode() + str3.GetHashCode();
}
What is a safe way to implement this method?
If you're implementing a reference type, you should consider overriding the Equals method if your type looks like a base type, such as Point, String, BigNumber, and so on. Override the GetHashCode method to allow a type to work correctly in a hash table.
It's my understanding that the original GetHashCode() returns the memory address of the object, so it's essential to override it if you wish to compare two different objects.
NO! A hash code is not an id, and it doesn't return a unique value. This is kind of obvious when you think about it: GetHashCode returns an Int32, which has “only” about 4.3 billion possible values (2^32), and there's potentially an infinity of different objects, so some of them are bound to have the same hash code.
This method is used to return the hash code for this instance. A hash code is a numeric value which is used to insert and identify an object in a hash-based collection. The GetHashCode method provides this hash code for algorithms that need quick checks of object equality.
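To make the hash-based collection point concrete, here is a minimal sketch. The type names `PlainName`, `HashedName` and `LookupDemo` are hypothetical and exist only for illustration. The first type inherits the identity-based behaviour of System.Object, so a set cannot find a second instance with the same content; the second overrides Equals and GetHashCode together, so it can.

```csharp
using System.Collections.Generic;

// Hypothetical type that does NOT override Equals/GetHashCode:
// it inherits reference-identity semantics from System.Object.
public sealed class PlainName
{
    public string Value { get; }
    public PlainName(string value) { Value = value; }
}

// Hypothetical type that overrides both members consistently.
public sealed class HashedName
{
    public string Value { get; }
    public HashedName(string value) { Value = value; }

    // Equality is defined by content, not by object identity.
    public override bool Equals(object obj)
    {
        return obj is HashedName other && Value == other.Value;
    }

    // Equal objects must return equal hash codes.
    public override int GetHashCode()
    {
        return Value != null ? Value.GetHashCode() : 0;
    }
}

public static class LookupDemo
{
    // A distinct instance with the same content is not found.
    public static bool PlainFound()
    {
        var set = new HashSet<PlainName> { new PlainName("a") };
        return set.Contains(new PlainName("a"));
    }

    // With the overrides in place, the equal instance is found.
    public static bool HashedFound()
    {
        var set = new HashSet<HashedName> { new HashedName("a") };
        return set.Contains(new HashedName("a"));
    }
}
```

`LookupDemo.PlainFound()` returns false and `LookupDemo.HashedFound()` returns true, which is the practical reason the two methods are overridden as a pair.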
The best way is to avoid anything that would produce the same hash code if you:
Both addition (by itself) and XOR fail on these accounts.
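A small sketch of why plain addition and plain XOR are weak combiners. The class `NaiveCombine` is hypothetical, and the int parameters are an assumption standing in for the per-field hash codes, so that the results do not depend on how string hashing is implemented.

```csharp
// Hypothetical helper: the ints stand in for the hash codes of three fields.
public static class NaiveCombine
{
    // Plain addition: the order of the operands does not affect the result.
    public static int Add(int a, int b, int c)
    {
        unchecked { return a + b + c; }
    }

    // Plain XOR: also order-insensitive, and two equal inputs cancel to 0.
    public static int Xor(int a, int b, int c)
    {
        return a ^ b ^ c;
    }
}
```

`NaiveCombine.Add(1, 2, 3)` and `NaiveCombine.Add(3, 2, 1)` both give 6, and `NaiveCombine.Xor(7, 7, 5)` and `NaiveCombine.Xor(9, 9, 5)` both give 5, so objects that clearly differ end up with the same hash code.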
Here's a better approach:
public override int GetHashCode()
{
    unchecked
    {
        int result = 37; // prime
        result *= 397; // also prime (see note)
        if (str1 != null)
            result += str1.GetHashCode();
        result *= 397;
        if (str2 != null)
            result += str2.GetHashCode();
        result *= 397;
        if (str3 != null)
            result += str3.GetHashCode();
        return result;
    }
}
Whether you use addition or XOR inside that code is up for debate. I've seen examples using both, with no clear analysis of which is superior (i.e., which gives the more uniform distribution). Pick one and go with it.
397 is the default value used by the ReSharper addin when it generates GetHashCode implementations, and is apparently selected because it typically overflows the range of the int and thus mixes bits a bit better. There are many theories around this particular format of GetHashCode implementation, but it's the most used one.
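To see what the multiplication steps buy you, here is a sketch with the same shape as the code above. The class `MultiplyAddCombine` is hypothetical, and the int parameters are an assumption standing in for the three field hash codes so the behaviour is easy to check by hand.

```csharp
// Hypothetical helper mirroring the multiply-then-add pattern above.
public static class MultiplyAddCombine
{
    public static int Combine(int h1, int h2, int h3)
    {
        unchecked // let the arithmetic wrap around instead of throwing
        {
            int result = 37;
            result *= 397;
            result += h1;
            result *= 397;
            result += h2;
            result *= 397;
            result += h3;
            return result;
        }
    }
}
```

Because each earlier value is multiplied by 397 again before the next one is added, `MultiplyAddCombine.Combine(1, 2, 3)` and `MultiplyAddCombine.Combine(3, 2, 1)` produce different results, unlike a plain sum of the same three numbers.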
I always use exclusive or (XOR) rather than addition, because it doesn't tend to push the result in any particular direction (such as toward large values). So I would say that
public override int GetHashCode()
{
    return str1.GetHashCode() ^ str2.GetHashCode() ^ str3.GetHashCode();
}
is a better implementation.
You could also try a variation on it, like
public override int GetHashCode()
{
    unchecked
    {
        return (str1.GetHashCode() * 1369) ^
               (str2.GetHashCode() * 37) ^ str3.GetHashCode();
    }
}
if you want to make sure that switching the values of the strings gives a different result. There are all sorts of methods that can be used for hashing (e.g. universal hashing), so just do a search for hashing methods if that's what you're looking for.
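As one more option among those methods, and not something the answers above use: if you can target a framework that ships System.HashCode (an assumption; it is available from .NET Core 2.1 and .NET Standard 2.1), the built-in combiner can do the mixing for you. A minimal sketch, where the type `ThreeStrings` is a hypothetical stand-in for the type in the question:

```csharp
using System;

// Hypothetical type standing in for the three-string type from the question.
public sealed class ThreeStrings
{
    private readonly string str1, str2, str3;

    public ThreeStrings(string s1, string s2, string s3)
    {
        str1 = s1; str2 = s2; str3 = s3;
    }

    // Equality is defined by the three strings.
    public override bool Equals(object obj)
    {
        return obj is ThreeStrings other
            && str1 == other.str1
            && str2 == other.str2
            && str3 == other.str3;
    }

    // HashCode.Combine accepts null arguments, so no null checks are needed here.
    public override int GetHashCode()
    {
        return HashCode.Combine(str1, str2, str3);
    }
}
```

The resulting values should be treated as valid only within the running process, so they are not something to persist or send between machines.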