When constructing a HashSet and a LinkedHashSet from a collection, the initialCapacity is set to different values in the default implementation.
HashSet:
public HashSet(Collection<? extends E> c) {
    map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
    addAll(c);
}
LinkedHashSet:
public LinkedHashSet(Collection<? extends E> c) {
    super(Math.max(2*c.size(), 11), .75f, true);
    addAll(c);
}
I'm sure there is a perfectly valid reason for this, but I fail to see it.
From the code you showed us, here are the specs for HashSet and LinkedHashSet:
data structure | initial capacity          | load factor
HashSet        | max(size / 0.75 + 1, 16)  | 0.75
LinkedHashSet  | max(2 * size, 11)         | 0.75
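To make the table concrete, here is a small sketch (the class and method names are my own, not part of either collection) that simply evaluates the two constructor formulas for a few source-collection sizes:

public class InitialCapacityTable {
    // Formula used by HashSet(Collection): size / 0.75 + 1, but never below 16.
    static int hashSetCapacity(int size) {
        return Math.max((int) (size / .75f) + 1, 16);
    }

    // Formula used by LinkedHashSet(Collection): 2 * size, but never below 11.
    static int linkedHashSetCapacity(int size) {
        return Math.max(2 * size, 11);
    }

    public static void main(String[] args) {
        for (int size : new int[] {0, 5, 10, 50, 100, 1000}) {
            System.out.printf("size=%4d  HashSet=%5d  LinkedHashSet=%5d%n",
                    size, hashSetCapacity(size), linkedHashSetCapacity(size));
        }
    }
}

For tiny inputs the HashSet floor of 16 wins out, but once the source collection grows, LinkedHashSet asks for roughly 50% more table space.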
Off the top of my head, it is probably more costly to rehash a LinkedHashSet than a plain HashSet, as the former has a linked list running through it, which might also have to be restructured or recalculated. By making the initial capacity greater, we might avoid exceeding it at all for some typical use cases.
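As a rough illustration of that headroom argument, the arithmetic below is my own sketch; note that it ignores the fact that HashMap rounds any requested capacity up to the next power of two, which makes the real-world difference smaller than these raw numbers suggest:

public class HeadroomSketch {
    public static void main(String[] args) {
        int sourceSize = 100;
        float loadFactor = 0.75f;

        // Requested capacities per the two constructors (before HashMap's
        // internal rounding to a power of two, which this sketch ignores).
        int hashSetCapacity = Math.max((int) (sourceSize / loadFactor) + 1, 16);  // 134
        int linkedHashSetCapacity = Math.max(2 * sourceSize, 11);                 // 200

        // A resize is triggered once size exceeds capacity * loadFactor,
        // so the headroom is the threshold minus the elements already present.
        int hashSetHeadroom = (int) (hashSetCapacity * loadFactor) - sourceSize;              // ~0
        int linkedHashSetHeadroom = (int) (linkedHashSetCapacity * loadFactor) - sourceSize;  // ~50

        System.out.println("HashSet headroom before first resize: " + hashSetHeadroom);
        System.out.println("LinkedHashSet headroom before first resize: " + linkedHashSetHeadroom);
    }
}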
When the initial capacity of a hash table data structure is exceeded in Java, it needs to be expanded. This requires, among other things, that every entry in the table be rehashed to a new bucket. The cost of doing this should be roughly the same in both LinkedHashSet and plain HashSet. However, a LinkedHashSet has an additional requirement when expanding the capacity, because it maintains a linked list running through the entries. This list might also need to be restructured in the process. Hence, I would expect the cost of expanding capacity to be higher in LinkedHashSet than in plain HashSet. By giving LinkedHashSet a greater initial capacity, we can avoid this costly expansion of capacity for a longer time.
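For intuition about why that expansion is worth postponing, here is a toy sketch (my own code, not the JDK's actual resize logic): growing the table means re-bucketing every stored key, so the work grows with the number of entries, and, as argued above, a LinkedHashSet might additionally have its insertion-order list to keep consistent.

import java.util.ArrayList;
import java.util.List;

public class ToyResizeDemo {
    // Distribute keys over 'capacity' buckets the simple way: hash modulo capacity.
    static List<List<String>> bucketize(List<String> keys, int capacity) {
        List<List<String>> buckets = new ArrayList<>(capacity);
        for (int i = 0; i < capacity; i++) buckets.add(new ArrayList<>());
        for (String k : keys) {
            int index = Math.floorMod(k.hashCode(), capacity);
            buckets.get(index).add(k);   // each key is touched once, so a resize is O(n)
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("a", "b", "c", "d", "e", "f", "g", "h");
        // "Resizing": the same keys must all be re-distributed when capacity doubles.
        System.out.println(bucketize(keys, 4));
        System.out.println(bucketize(keys, 8));  // all 8 keys are rehashed again
    }
}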
LinkedHashSet Javadoc