
Hash code as key in keyed collection

Tags: c#, .net, hash

As far as I know (or thought I knew), a Dictionary is implemented as a hash table, where the hash code is used to identify a bucket, which is then searched for the key.

In my opinion, this implies that the hash code of an object remains stable during a single run of my program (loosely speaking).

Now, here

http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx

I read

"A hash code is intended for efficient insertion and lookup in collections that are based on a hash table. A hash code is not a permanent value. For this reason: [...] Do not use the hash code as the key to retrieve an object from a keyed collection."

Can anybody explain to me what that is supposed to mean?

asked Jun 03 '14 22:06 by JohnB



2 Answers

When the documentation talks about a "keyed collection", it does not mean the same thing as a Dictionary. For insight into what it actually means, note that there is a KeyedCollection base class: http://msdn.microsoft.com/en-us/library/ms132438%28v=vs.110%29.aspx

The key paragraph is this:

Unlike dictionaries, an element of KeyedCollection<TKey, TItem> is not a key/value pair; instead, the entire element is the value and the key is embedded within the value. For example, an element of a collection derived from KeyedCollection<String,String> (KeyedCollection(Of String, String) in Visual Basic) might be "John Doe Jr." where the value is "John Doe Jr." and the key is "Doe"; or a collection of employee records containing integer keys could be derived from KeyedCollection<int,Employee>. The abstract GetKeyForItem method extracts the key from the element.

So a keyed collection is a collection of objects along with a way of extracting a key from each one. Conceptually this is similar to a table in a database, where you can define a primary key which is a subset of the entire record.
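A minimal sketch of that idea, loosely following the employee-records case mentioned in the quoted paragraph. The Employee type, its BadgeId property, and the Staff class are hypothetical names made up for illustration; only KeyedCollection<TKey, TItem> and GetKeyForItem come from the framework.

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical record type; BadgeId is the key embedded in the item.
public class Employee
{
    public string BadgeId { get; }
    public string Name { get; }

    public Employee(string badgeId, string name)
    {
        BadgeId = badgeId;
        Name = name;
    }
}

// The collection stores whole Employee objects and derives each key from the item.
public class Staff : KeyedCollection<string, Employee>
{
    protected override string GetKeyForItem(Employee item)
    {
        return item.BadgeId;
    }
}

public static class StaffDemo
{
    public static Staff Build()
    {
        var staff = new Staff();
        staff.Add(new Employee("E-17", "John Doe Jr."));
        staff.Add(new Employee("E-42", "Jane Roe"));
        return staff;
    }

    public static void Main()
    {
        Staff staff = Build();
        Console.WriteLine(staff["E-42"].Name);     // lookup by embedded key
        Console.WriteLine(staff[0].Name);          // lookup by position
        Console.WriteLine(staff.Contains("E-99")); // key membership test
    }
}
```

A string key is used here so that lookup by key and lookup by position stay visibly distinct; with an int key the two indexers take the same argument type.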

So with this in mind, the answer becomes relatively clear. As others have said, equal hash codes do not imply equal objects. But keys in a keyed collection, like primary keys in a database table, should uniquely identify the exact object, so the possibility of hash collisions makes hash codes inappropriate for this purpose.

Also, even in a Dictionary, there's an important difference between using objects as keys and using the same objects' hash codes as the key. If two objects have a hash collision but do not compare as equal, the Dictionary will still store them as two separate keys. That's why overriding GetHashCode to just return 1 is always valid (though obviously not good for performance). As a demonstration:

var dict = new Dictionary<MyClass, string>();
var hashDict = new Dictionary<int, string>();

dict[myObj1] = "One";
hashDict[myObj1.GetHashCode()] = "One";
dict[myObj2] = "Two";
hashDict[myObj2.GetHashCode()] = "Two";

Console.Out.WriteLine(dict[myObj1]);  //Outputs "One"
Console.Out.WriteLine(hashDict[myObj1.GetHashCode()]); //Outputs "Two"

(myObj1 and myObj2 are instances of MyClass which have the same hash code but do not compare as equal)
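For anyone who wants to run the demonstration, here is one possible complete version. The MyClass definition is an assumption: it compares by a made-up Id property and returns a constant hash code, which is just one way to get two unequal objects with the same hash.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical MyClass: equality is based on Id, but every instance
// deliberately reports the same hash code to force a collision.
public sealed class MyClass : IEquatable<MyClass>
{
    public int Id { get; }

    public MyClass(int id) { Id = id; }

    public bool Equals(MyClass other) => other != null && other.Id == Id;
    public override bool Equals(object obj) => Equals(obj as MyClass);
    public override int GetHashCode() => 1; // valid, but slow for large tables
}

public static class CollisionDemo
{
    // Returns { value found via the object key, value found via the hash key }.
    public static string[] Run()
    {
        var myObj1 = new MyClass(1);
        var myObj2 = new MyClass(2);

        var dict = new Dictionary<MyClass, string>();
        var hashDict = new Dictionary<int, string>();

        dict[myObj1] = "One";
        hashDict[myObj1.GetHashCode()] = "One";
        dict[myObj2] = "Two";                   // separate key: Equals says they differ
        hashDict[myObj2.GetHashCode()] = "Two"; // same int key: overwrites "One"

        return new[] { dict[myObj1], hashDict[myObj1.GetHashCode()] };
    }

    public static void Main()
    {
        string[] result = Run();
        Console.WriteLine(result[0]);
        Console.WriteLine(result[1]);
    }
}
```

The object-keyed dictionary ends up with two entries because it falls back on Equals after the hash codes match, while the int-keyed dictionary only ever sees the single key 1.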

answered Sep 19 '22 17:09 by Ben Aaronson


They might be talking about KeyedCollection.
In that case there is no point in using a hash as the key.
The key is supposed to be a real value used by the class.

As in this example:

public class SimpleOrder : KeyedCollection<int, OrderItem>
{
    // The parameterless constructor of the base class creates a  
    // KeyedCollection with an internal dictionary. For this code  
    // example, no other constructors are exposed. 
    // 
    public SimpleOrder() : base() {}

    // This is the only method that absolutely must be overridden, 
    // because without it the KeyedCollection cannot extract the 
    // keys from the items. The input parameter type is the  
    // second generic type argument, in this case OrderItem, and  
    // the return value type is the first generic type argument, 
    // in this case int. 
    // 
    protected override int GetKeyForItem(OrderItem item)
    {
        // In this example, the key is the part number. 
        return item.PartNumber;
    }
}

PartNumber is a property of OrderItem (that happens to be an int).
You should never return the hash code of an OrderItem from GetKeyForItem.
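
A rough sketch of what can go wrong if you do it anyway. The OrderItem below is a hypothetical stand-in with a deliberately weak hash (part number modulo 10), and HashKeyedOrder is a made-up name for the anti-pattern. As far as I know, KeyedCollection rejects a duplicate key with an ArgumentException, so two different items that happen to share a hash code cannot both be stored.

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical item type whose hash code intentionally collides for
// part numbers that share the same last digit.
public class OrderItem
{
    public int PartNumber { get; }

    public OrderItem(int partNumber) { PartNumber = partNumber; }

    public override bool Equals(object obj)
    {
        var other = obj as OrderItem;
        return other != null && other.PartNumber == PartNumber;
    }

    public override int GetHashCode() => PartNumber % 10;
}

// Anti-pattern: the key is the item's hash code instead of a real identifier.
public class HashKeyedOrder : KeyedCollection<int, OrderItem>
{
    protected override int GetKeyForItem(OrderItem item) => item.GetHashCode();
}

public static class HashKeyDemo
{
    // Returns true if adding a second, different item with a colliding
    // hash code is rejected by the collection.
    public static bool SecondAddIsRejected()
    {
        var order = new HashKeyedOrder();
        order.Add(new OrderItem(1));
        try
        {
            order.Add(new OrderItem(11)); // different item, same hash code (1)
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(SecondAddIsRejected());
    }
}
```

Keying on PartNumber instead avoids the problem, because distinct items then have distinct keys.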

answered Sep 20 '22 17:09 by paparazzo