Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How .NET Dictionary implementation works with mutable objects

I understand that it is not advisable to use "mutable" objects (objects whose GetHashCode() method can return different results while they being used as keys in a Dictionary).

Below is my understanding of how a dictionary, which is implemented as a hash table, works:

When I am adding new key, for example dict.Add(m1, "initially here was m1 object");, dict calculates the hashcode of m1 using the GetHashCode() method. Then it does some internal calculations and finally puts this object into some position of its internal array.

When I am using the key index to get the value, for example dict[m1], dict calculates the hashcode again. Then it does some internal calculations, and it gives me an object which is located at the calculated position inside of its internal array.

But I think there is an error which I can't find.

So lets assume that I have this code:

    class MutableObject
    {
         Int32 m_value;

         public MutableObject(Int32 value)
         {
             m_value = value;
         }

         public void Mutate(Int32 value)
         {
             m_value = value;
         }

         public override int GetHashCode()
         {
             return m_value;
         }
   }

   static void Main(string[] args)
   {
         MutableObject m1 = new MutableObject(1);
         MutableObject m2 = new MutableObject(2);

         var dict = new Dictionary<MutableObject, String>();
         dict.Add(m1, "initially here was m1 object");
         dict.Add(m2, "initially here was m2 object");

         Console.WriteLine("Before mutation:");
         Console.WriteLine("dict[m1] = " + dict[m1]);
         Console.WriteLine("dict[m2] = " + dict[m2]);

         m1.Mutate(2);
         m2.Mutate(1);

         Console.WriteLine("After mutation:");
         Console.WriteLine("dict[m1] = " + dict[m1]);
         Console.WriteLine("dict[m2] = " + dict[m2]);

         Console.ReadKey(true);
   }

When I call Mutate methods, keys are swapped. So I thought it will give swapped results. But actually this line: Console.WriteLine("dict[m1] = " + dict[m1]); throws KeyNotFoundException, and I can't understand why. Obviously I am missing something here...

like image 296
acrilige Avatar asked Mar 12 '13 19:03

acrilige


People also ask

Is dictionary mutable in C#?

The Dictionary<TKey,TValue> type makes no attempt to protect against the user modifying the key used. It is purely left up to the developer to be responsible in not mutating the key.

How. net dictionary works?

NET Dictionary implementation conceptually uses chaining as its collision resolution method, but it doesn't use a separate data structure (like a linked list) to store the items in the chain, it rather stores every entry in the same array. Now we add some more elements to the dictionary, then remove two of them.


3 Answers

How .NET Dictionary implementation works with mutable objects

It doesn't. The documentation for Dictionary states:

As long as an object is used as a key in the Dictionary<TKey, TValue>, it must not change in any way that affects its hash value.

Since you're changing the object while it's in the Dictionary it will not work.

As for why, it's not too hard to see. We put in an object. Let's assume that the hash code is 1. We put the object in the 1 bucket of our hash table. Now the object is mutated from outside the Dictionary so that it's value (and hash code) is 2. Now when someone gives that object to the dictionary's indexer it gets the hash code, see's that it's 2, and looks in the 2 bucket. That bucket is empty, so it says, "sorry, no element".

Now let's assume that a new object is created with a value and hash of 1. It's passed to the Dictionary, who sees that the hash is 1. It looks in the 1 bucket and finds that there is indeed an item at that index. It now uses Equals to determine if the objects are in fact equal (or if this is just a hash collision).

Now, in your case, it will fail here because you don't override Equals, you're using the default implementation which compares references, and since this is a different object it won't have the same reference. However, even if you changed it to compare the values, *the first object was mutated to have a value of 2, not 1, so it won't match anyway. Others have suggested fixing this Equals method, and you really should do that, but it still won't fix your problem.

Once the object is mutated the only way of finding it is if it just so happens that the mutated value is a hash collision (which is possible, but unlikely). If it's not, then anything that is equal according to Equals will never know to check the right bucket, and anything that checks the right bucket won't be equal according to Equals.

The quote I mentioned at the start isn't just a best practice. It's not just unexpected or weird or unperformant to mutate items in a dictionary. It just doesn't work.

Now, if the object is mutable but isn't mutated while it's in the dictionary then that's fine. It may be a bit odd, and that's a case people may say is bad practice, even if it works.

like image 76
Servy Avatar answered Oct 19 '22 00:10

Servy


It's not enough to do a dictionary lookup to have the same hash code. Since hash collisions are possible, the key must also be equal the index being looked up.

like image 21
recursive Avatar answered Oct 19 '22 00:10

recursive


Your MutableObject class doesn't override Equals(object). Hence reference equality is used (inherited from base class System.Object).

The Dictionary<,> first (quickly) finds any keys with the correct hash code. It then examines each of those candidate keys to check if one of them Equals the key it is searching for.

Therefore Equals(object) and GetHashCode() should be overridden together. You would get a warning from the compiler if you overrode only one of them.

As soon as the hash code of a key mutates while the key is in the Dictionary<,>, that key will (probably) be misplaced inside the Dictionary<,>, be in the wrong "bucket", and hence be lost. It will not be found, because search for it will always take place in a bucket where it isn't located.

In this example, the key gets lost, and therefore can be added again:

var dict = new Dictionary<MutableObject, string>();

var m = new MutableObject(1);

dict.Add(m, "Hello");

m.Mutate(2);

dict.Add(m, "world");

foreach (var p in dict)
    Console.WriteLine(p);

var otherDict = new Dictionary<MutableObject, string>(dict); // throws

I have actually seen an exception like that, during initializing of one Dictionary<,> with the items from an existing Dictionary<,> (both using the default EqualityComparer<> for the key type).

like image 36
Jeppe Stig Nielsen Avatar answered Oct 18 '22 23:10

Jeppe Stig Nielsen