Custom type GetHashCode [duplicate]

Tags: c#, .net

Possible Duplicate:
What is the best algorithm for an overridden System.Object.GetHashCode?

I need to override the GetHashCode method for a type that consists of three strings. Here is my code:

public override int GetHashCode() // must be public to match Object.GetHashCode
{
    return str1.GetHashCode() + str2.GetHashCode() + str3.GetHashCode();
}

What is a safe way to implement this method?

SiberianGuy asked Feb 20 '11 21:02




2 Answers

The best way is to avoid anything that would produce the same hash code if you:

  • swapped the order of the operands
  • had mostly-zero values and just moved the non-zero value around

Both addition (by itself) and XOR fail on these accounts.
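To make the two failure modes concrete, here is a minimal sketch. It uses plain ints as stand-ins for the three strings' hash codes (an assumption, so the results do not depend on how the runtime hashes strings), and the names NaiveCombiners, Sum and Xor are made up for illustration:

```csharp
using System;

static class NaiveCombiners
{
    // Plain addition: the order of the operands does not matter.
    public static int Sum(int a, int b, int c)
    {
        unchecked { return a + b + c; }
    }

    // Plain XOR: also order-independent, and equal inputs cancel out.
    public static int Xor(int a, int b, int c)
    {
        return a ^ b ^ c;
    }

    static void Main()
    {
        // Swapping the operands gives the same result.
        Console.WriteLine(Sum(1, 2, 3) == Sum(3, 2, 1)); // True
        Console.WriteLine(Xor(1, 2, 3) == Xor(3, 2, 1)); // True

        // Moving a single non-zero value around gives the same result.
        Console.WriteLine(Sum(7, 0, 0) == Sum(0, 7, 0)); // True
        Console.WriteLine(Xor(7, 0, 0) == Xor(0, 0, 7)); // True

        // With XOR, two equal field hashes cancel each other out entirely.
        Console.WriteLine(Xor(5, 5, 9) == Xor(8, 8, 9)); // True
    }
}
```

Collisions like these are legal (hash codes do not have to be unique), but they are systematic, so objects that differ only in field order all land in the same bucket.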

Here's a better approach:

public override int GetHashCode()
{
    unchecked
    {
        int result = 37; // prime

        result *= 397; // also prime (see note)
        if (str1 != null)
            result += str1.GetHashCode();

        result *= 397;
        if (str2 != null)
            result += str2.GetHashCode();

        result *= 397;
        if (str3 != null)
            result += str3.GetHashCode();

        return result;
    }
}

Whether you use addition or XOR inside that code is up for debate. I've seen examples using both, with no clear analysis of which is superior (i.e., which gives a more uniform distribution). Pick one and go with it.

Note: 397 is the default value used by the ReSharper add-in when it generates GetHashCode implementations, and it was apparently selected because multiplying by it typically overflows the range of int and thus mixes the bits a bit better. There are many theories around this particular form of GetHashCode implementation, but it is the most widely used one.
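For context, a hash code only helps if it agrees with Equals. The sketch below puts the multiply-and-add pattern into a complete type and uses it in a HashSet. The class name Triple and its constructor are hypothetical (the question never names the type), and ordinal string comparison is assumed for equality:

```csharp
using System;
using System.Collections.Generic;

sealed class Triple
{
    private readonly string str1, str2, str3;

    public Triple(string str1, string str2, string str3)
    {
        this.str1 = str1;
        this.str2 = str2;
        this.str3 = str3;
    }

    // Two triples are equal when all three strings match, in order.
    public override bool Equals(object obj)
    {
        var other = obj as Triple;
        return other != null
            && str1 == other.str1
            && str2 == other.str2
            && str3 == other.str3;
    }

    // Same multiply-and-add pattern as above; null fields contribute 0.
    public override int GetHashCode()
    {
        unchecked
        {
            int result = 37;
            result = result * 397 + (str1 != null ? str1.GetHashCode() : 0);
            result = result * 397 + (str2 != null ? str2.GetHashCode() : 0);
            result = result * 397 + (str3 != null ? str3.GetHashCode() : 0);
            return result;
        }
    }
}

static class TripleDemo
{
    static void Main()
    {
        var set = new HashSet<Triple> { new Triple("a", "b", "c") };

        // An equal triple is found because Equals and GetHashCode agree.
        Console.WriteLine(set.Contains(new Triple("a", "b", "c"))); // True

        // A reordered triple is a different value and is not found.
        Console.WriteLine(set.Contains(new Triple("c", "b", "a"))); // False
    }
}
```

The fields are readonly on purpose: if a field changed while the object sat in a HashSet or Dictionary, its hash code would change and the collection could no longer find it.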

Lasse V. Karlsen answered Oct 06 '22 16:10


I always use exclusive or (XOR) rather than addition, because it doesn't tend to push the result in any particular direction (such as toward large values). So I would say that

public override int GetHashCode() // public, to match Object.GetHashCode
{ return str1.GetHashCode() ^ str2.GetHashCode() ^ str3.GetHashCode(); }

is a better implementation.

You could also try a variation on it, like

public override int GetHashCode() // public, to match Object.GetHashCode
{
    unchecked
    {
        return (str1.GetHashCode() * 1369) ^
               (str2.GetHashCode() * 37) ^ str3.GetHashCode();
    }
}

if you want to make sure that switching the values of the strings gives a different result. There are all sorts of methods that can be used for hashing (e.g. universal hashing), so just search for hashing methods if that's what you're looking for.
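As a rough check of that claim, the sketch below applies the same weights (1369, which is 37 * 37, and 37) to plain ints standing in for the strings' hash codes. The names WeightedXor and Combine are made up for illustration:

```csharp
using System;

static class WeightedXor
{
    // Same weighting as the variation above, applied to raw ints.
    public static int Combine(int a, int b, int c)
    {
        unchecked { return (a * 1369) ^ (b * 37) ^ c; }
    }

    static void Main()
    {
        // Swapping the values now changes the result.
        Console.WriteLine(Combine(1, 2, 3) == Combine(3, 2, 1)); // False

        // Moving a single non-zero value around also changes it.
        Console.WriteLine(Combine(7, 0, 0) == Combine(0, 7, 0)); // False

        // Collisions are reduced, not eliminated: 1 * 1369 equals 37 * 37.
        Console.WriteLine(Combine(1, 0, 0) == Combine(0, 37, 0)); // True
    }
}
```

The last line is a reminder that any combiner still collides for some inputs. The goal is only to avoid collisions for inputs that are likely to occur together.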

user541686 answered Oct 06 '22 16:10