
When should I define a hash code function for my types?

Tags: java, c#

Is there any reason to implement a hash code function for my types other than making them work well in hash tables?

Let's say I am designing some types that I intend to use internally. I know these types are "internal" to the system, and I also know I will never use them in hash tables. Even so, I decide I need to override the equals() method.

Theory says I should also redefine the hash code method, but I can't see any reason why, in this case, I should do it.

Can anyone point out any other reason?

This question can be rephrased as: in which situations should we implement a hash code method in our types?

PS: I am not asking how to implement one. I am asking when.

devoured elysium asked Jun 13 '10 21:06



2 Answers

You might not - but will any of your code, for example, use LINQ? There are a number of unexpected places that might use a hashmap or dictionary on your data.

If you don't want unexpected... "fun", then if you change Equals, override GetHashCode. Likewise, any IEquatable<T>.Equals should match the object.Equals implementation.
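The point above is about C# and LINQ, but the same kind of surprise exists on the Java side. As a rough sketch (the `Tag` class and `HiddenHashingDemo` are made up for illustration, and the reliance on hashing is an implementation detail of current OpenJDK rather than a documented guarantee), `Stream.distinct()` never mentions a hash table, yet it depends on `hashCode()` agreeing with `equals()`:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical type used only for illustration.
final class Tag {
    private final String name;

    Tag(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Tag && ((Tag) o).name.equals(name);
    }

    // Without this override, equal tags would get unrelated identity hashes
    // and distinct() below would most likely stop deduplicating them.
    @Override
    public int hashCode() { return name.hashCode(); }
}

final class HiddenHashingDemo {
    static long distinctCount(List<Tag> tags) {
        // No hash table in sight, but distinct() uses a hash-based set internally.
        return tags.stream().distinct().count();
    }

    public static void main(String[] args) {
        List<Tag> tags = Arrays.asList(new Tag("java"), new Tag("java"), new Tag("c#"));
        System.out.println(distinctCount(tags)); // prints 2
    }
}
```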

Marc Gravell answered Nov 15 '22 20:11


Yes, definitely. hashCode and equals are two views of the same thing and have to be consistent: objects that are equal must return the same hash code. Many routines in the Collections framework use the hash code and start misbehaving if it disagrees with equals. You can read 'misbehaving' as 'incredibly hard-to-find bugs which lead to early loss of hair'.

If you override equals, you must override hashCode, not because the guideline says so, but because you value your hair (or time).
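To make the 'misbehaving' concrete, here is a minimal sketch. `BrokenPoint` and `FixedPoint` are hypothetical types invented for this example; the only difference between them is whether hashCode is overridden alongside equals:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical type: overrides equals() but NOT hashCode().
final class BrokenPoint {
    final int x, y;

    BrokenPoint(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof BrokenPoint)) return false;
        BrokenPoint p = (BrokenPoint) o;
        return x == p.x && y == p.y;
    }
}

// Same idea, but hashCode() is derived from the same fields equals() compares.
final class FixedPoint {
    final int x, y;

    FixedPoint(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof FixedPoint)) return false;
        FixedPoint p = (FixedPoint) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() { return 31 * x + y; }
}

final class HashContractDemo {
    public static void main(String[] args) {
        Set<BrokenPoint> broken = new HashSet<>();
        broken.add(new BrokenPoint(1, 2));
        // equals() reports the two points as equal...
        System.out.println(new BrokenPoint(1, 2).equals(new BrokenPoint(1, 2)));
        // ...but the set looks in the bucket chosen by the identity hash,
        // so this lookup usually misses.
        System.out.println(broken.contains(new BrokenPoint(1, 2)));

        Set<FixedPoint> fixed = new HashSet<>();
        fixed.add(new FixedPoint(1, 2));
        System.out.println(fixed.contains(new FixedPoint(1, 2))); // prints true
    }
}
```

The broken lookup is not even reliably wrong, since it depends on identity hash values, which is part of why such bugs are so hard to track down.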

Modern IDEs generate good equals/hashCode for you, and the EqualsBuilder/HashCodeBuilder from Apache Commons Lang or Spring can help make it easier. Project Lombok generates them on the fly.

This is serious stuff. The best you can do with these methods is get them right, and there are hundreds of ways of getting them wrong, leading to pain and agony. If you can avoid writing them yourself, do so: use one of the generators or libraries to help you.
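As one possible illustration of leaning on a library, here is a sketch that uses the JDK's own `java.util.Objects` helpers instead of the tools named above (that substitution is my assumption, chosen to avoid extra dependencies). The `Account` type and its fields are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical internal type; the field names are made up for illustration.
final class Account {
    private final String owner;
    private final int number;

    Account(String owner, int number) {
        this.owner = owner;
        this.number = number;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Account)) return false;
        Account other = (Account) o;
        // Objects.equals is null-safe, so a null owner does not throw.
        return number == other.number && Objects.equals(owner, other.owner);
    }

    @Override
    public int hashCode() {
        // Built from exactly the fields that equals() compares.
        return Objects.hash(owner, number);
    }
}

final class AccountDemo {
    static String lookup() {
        Map<Account, String> notes = new HashMap<>();
        notes.put(new Account("alice", 7), "primary");
        // A distinct but equal key finds the entry because the hash codes match.
        return notes.get(new Account("alice", 7));
    }

    public static void main(String[] args) {
        System.out.println(lookup()); // prints primary
    }
}
```

Keeping both methods tied to the same list of fields is the main design choice here: if a field is added to one and not the other, the two drift apart and the problems described above come back.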

Peter Tillemans answered Nov 15 '22 19:11