I'm a bit confused about the internal implementation of HashSet
and HashMap
in java.
This is my understanding, so please correct me if I'm wrong:
Neither HashSet
or HashMap
allow duplicate elements.
HashSet
is backed by a HashMap
, so in a HashSet
when we call .add(element)
, we are calling the hashCode()
method on the element and internally doing a put(k,v)
to the internal HashMap
, where the key is the hashCode
and the value is the actual object. So if we try to add the same object to the Set
, it will see that the hashCode
is already there, and then replace the old value by the new one.
But then, this seems inconsistent to me when I read how a HashMap
works when storing our own objects as keys in a HashMap
.
In this case we must override the hashCode()
and equals()
methods and make them consistent between each other, because, if we find keys with the same hashCode
, they will go to the same bucket, and then to distinguish between all the entries with the same hashCode
we have to iterate over the list of entries to call the method equals()
on each key and find a match.
So in this case, we allow to have the same hashCode
and we create a bucket containing a list for all the objects with the same hashCode
, however using a HashSet
, if we find already a hashCode
, we replace the old value by the new value.
I'm a bit confused, could someone clarify this to me please?
You are correct regarding the behavior of HashMap
, but you are wrong about the implementation of HashSet
.
HashSet
is backed by a HashMap
internally, but the element you are adding to the HashSet
is used as the key in the backing HashMap
. For the value, a dummy value is used. Therefore the HashSet
's contains(element)
simply calls the backing HashMap
's containsKey(element)
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With