Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How HashSet works with regards to hashCode()?

I'm trying to understand java.util.Collection and java.util.Map a little deeper but I have some doubts about HashSet funcionality:

In the documentation, it says: This class implements the Set interface, backed by a hash table (actually a HashMap instance). Ok, so I can see that a HashSet always has a Hashtable working in background. A hashtable is a struct that asks for a key and a value everytime you want to add a new element to it. Then, the value and the key are stored in a bucket based on the key hashCode. If the hashcodes of two keys are the same, they add both key values to the same bucket, using a linkedlist. Please, correct me if I said something wrong.

So, my question is: If a HashSet always has a Hashtable acting in background, then everytime we add a new element to the HashSet using HashSet.add() method, the HashSet should add it to its internal Hashtable. But, the Hashtable asks for a value and a key, so what key does it use? Does it just uses the value we're trying to add also as a key and then take its hashCode? Please, correct me if I said something wrong about HashSet implementation.

Another question that I have is: In general, what classes can use the hashCode() method of an java object? I'm asking this because, in the documentation, it says that everytime we override equals() method we need to override hashCode() method. Ok, it really makes sense, but my doubt is if it's just a recommendation we should do to keep everything 'nice and perfect' (putting in this way), or if it's really necessary, because maybe a lot of Java defaults classes will constantly uses hashCode() method of your objects. In my vision, I can't see other classes using this method instead of those classes related to Collections. Thank you very much guys

like image 931
felipeek Avatar asked Jul 14 '14 18:07

felipeek


People also ask

How does HashSet use hashCode?

When we put an object into a HashSet, it uses the object's hashcode value to determine if an element is not in the set already. Each hash code value corresponds to a certain bucket location which can contain various elements, for which the calculated hash value is the same.

Does HashSet use equals or hashCode?

The HashSet takes advantage of hashcodes to speed things up. It assumes that two objects that equal eachother will have the same hash code. However it does not assume that two objects with the same hash code mean they are equal.

How does the default hashCode () work?

By default the hashCode() returns an integer that represents the internal memory address of the object. Where this comes in handy is in the creation and use of an important computer science data structure called a hash table.


2 Answers

If you look at the actual javacode of HashSet you can see what it does:

 // Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
...

 public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

So the element you are adding is the Key in the backing hashmap with a dummy value as the value. this dummy value is never actually used by the hashSet.

Your second question regarding overriding equals and hashcode:

It is really necessary to always override both if you want to override either one. This is because the contract for hashCode says equal objects must have the same hashcode. the default implementation of hashcode will give different values for each instance.

Therefore, if you override equals() but not hashcode() This could happen

object1.equals(object2) //true

MySet.add(object1);

MySet.contains(object2); //false but should be true if we overrode hashcode()

Since contains will use hashcode to find the bucket to search in we might get a different bucket back and not find the equal object.

like image 135
dkatzel Avatar answered Oct 05 '22 03:10

dkatzel


If you look at the source for HashSet (the source comes with the JDK and is very informative), you will see that it creates an object to use as the value:

// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();

Each value that is added to the HashSet is used as a key to the backing HashMap with this PRESENT object as the value.

Regarding overriding equals() whenever you override hashCode() (and vice versa), it is very important that these two methods return consistent results. That is, they should agree with one another. For more details, see the book Effective Java by Josh Bloch.

like image 22
David Conrad Avatar answered Oct 05 '22 01:10

David Conrad