Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does c# figure out the hash code for an object?

Tags:

c#

hash

internals

This question comes out of the discussion on tuples.

I started thinking about the hash code that a tuple should have. What if we will accept KeyValuePair class as a tuple? It doesn't override the GetHashCode() method, so probably it won't be aware of the hash codes of it's "children"... So, run-time will call Object.GetHashCode(), which is not aware of the real object structure.

Then we can make two instances of some reference type, which are actually Equal, because of the overloaded GetHashCode() and Equals(). And use them as "children" in tuples to "cheat" the dictionary.

But it doesn't work! Run-time somehow figures out the structure of our tuple and calls the overloaded GetHashCode of our class!

How does it work? What's the analysis made by Object.GetHashCode()?

Can it affect the performance in some bad scenario, when we use some complicated keys? (probably, impossible scenario... but still)

Consider this code as an example:

namespace csharp_tricks
{
    class Program
    {
        class MyClass
        {
            int keyValue;
            int someInfo;

            public MyClass(int key, int info)
            {
                keyValue = key;
                someInfo = info;
            }

            public override bool Equals(object obj)
            {
                MyClass other = obj as MyClass;
                if (other == null) return false;

                return keyValue.Equals(other.keyValue);
            }

            public override int GetHashCode()
            {
                return keyValue.GetHashCode();
            }
        }

        static void Main(string[] args)
        {
            Dictionary<object, object> dict = new Dictionary<object, object>();

            dict.Add(new KeyValuePair<MyClass,object>(new MyClass(1, 1), 1), 1);

            //here we get the exception -- an item with the same key was already added
            //but how did it figure out the hash code?
            dict.Add(new KeyValuePair<MyClass,object>(new MyClass(1, 2), 1), 1); 

            return;
        }
    }
}

Update I think I've found an explanation for this as stated below in my answer. The main outcomes of it are:

  • Be careful with your keys and their hash codes :-)
  • For complicated dictionary keys you must override Equals() and GetHashCode() correctly.
like image 717
Max Galkin Avatar asked Sep 19 '08 15:09

Max Galkin


2 Answers

Don't override GetHashcode() and Equals() on mutable classes, only override it on immutable classes or structures, else if you modify a object used as key the hash table won't function properly anymore (you won't be able to retrieve the value associated to the key after the key object was modified)

Also hash tables don't use hashcodes to identify objects they use the key objects themselfes as identifiers, it's not required that all keys that are used to add entries in a hash table return different hashcodes, but it is recommended that they do, else performance suffers greatly.

like image 185
Pop Catalin Avatar answered Oct 11 '22 17:10

Pop Catalin


Here are the proper Hash and equality implementations for the Quad tuple (contains 4 tuple components inside). This code ensures proper usage of this specific tuple in HashSets and the dictionaries.

More on the subject (including the source code) here.

Note usage of the unchecked keyword (to avoid overflows) and throwing NullReferenceException if obj is null (as required by the base method)

public override bool Equals(object obj)
{
    if (ReferenceEquals(null, obj))
        throw new NullReferenceException("obj is null");
    if (ReferenceEquals(this, obj)) return true;
    if (obj.GetType() != typeof (Quad<T1, T2, T3, T4>)) return false;
    return Equals((Quad<T1, T2, T3, T4>) obj);
}

public bool Equals(Quad<T1, T2, T3, T4> obj)
{
    if (ReferenceEquals(null, obj)) return false;
    if (ReferenceEquals(this, obj)) return true;
    return Equals(obj.Item1, Item1)
        && Equals(obj.Item2, Item2)
            && Equals(obj.Item3, Item3)
                && Equals(obj.Item4, Item4);
}

public override int GetHashCode()
{
    unchecked
    {
        int result = Item1.GetHashCode();
        result = (result*397) ^ Item2.GetHashCode();
        result = (result*397) ^ Item3.GetHashCode();
        result = (result*397) ^ Item4.GetHashCode();
        return result;
    }
}
public static bool operator ==(Quad<T1, T2, T3, T4> left, Quad<T1, T2, T3, T4> right)
{
    return Equals(left, right);
}


public static bool operator !=(Quad<T1, T2, T3, T4> left, Quad<T1, T2, T3, T4> right)
{
    return !Equals(left, right);
}
like image 43
Rinat Abdullin Avatar answered Oct 11 '22 19:10

Rinat Abdullin