What load factor should I use when I know the maximum possible number of elements in a HashSet? I have heard that the default load factor of 0.75 is recommended because it offers a good trade-off between speed and space. Is this correct? On the other hand, a larger HashSet also takes more time to create and uses more space.
I am using the HashSet just in order to remove duplicate integers from a list of integers.
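If the maximum number of elements is known up front, one common approach is to presize the set so it never needs to rehash. A minimal sketch, assuming the duplicates come in as a `List<Integer>` (the class and method names here are just for illustration):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DedupExample {
    public static Set<Integer> dedupe(List<Integer> input) {
        // Size the set so that input.size() entries stay below the resize
        // threshold (capacity * loadFactor) with the default load factor of 0.75.
        int initialCapacity = (int) (input.size() / 0.75f) + 1;
        Set<Integer> unique = new HashSet<>(initialCapacity);
        unique.addAll(input);
        return unique;
    }
}
```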
Constructs a new, empty set; the backing HashMap instance has the specified initial capacity and default load factor, which is 0.75.
The load factor is a measure of how full the HashSet is allowed to get before its capacity is automatically increased.
The load factor is defined as m/n, where n is the total size of the hash table and m is the number of entries that can be inserted before an increase in the size of the underlying data structure is required.
Constructs an empty HashMap with the default initial capacity (16) and the default load factor (0.75).
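To make the numbers concrete: with the default capacity of 16 and load factor of 0.75, the resize threshold is 16 × 0.75 = 12 entries, so the table grows when the 13th entry is added. A small sketch of the two constructors described above (the variable names are illustrative only):

```java
import java.util.HashMap;
import java.util.Map;

public class LoadFactorDemo {
    public static void main(String[] args) {
        // Default constructor: capacity 16, load factor 0.75,
        // so the table resizes once it holds more than 12 entries.
        Map<String, Integer> defaults = new HashMap<>();

        // Explicit capacity and load factor: here the table grows
        // once its size exceeds 32 * 0.5 = 16 entries.
        Map<String, Integer> tuned = new HashMap<>(32, 0.5f);
    }
}
```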
I spent some time playing around with load factors once, and it is shocking how little difference that setting really makes in practice. Even setting it to something high like 2.0 doesn't slow things down much, nor does it save that much memory. Just pretend it doesn't exist. Josh has often regretted ever exposing it as an option at all.
For your stated problem, instead of using a HashSet, you might also consider a BitSet.
Depending on the range and sparsity of your integers, you may get better performance and space characteristics.
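A minimal sketch of the BitSet approach, assuming the integers are non-negative and bounded by some known maximum (the class name, method name, and `maxValue` parameter are illustrative):

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class BitSetDedup {
    // One bit per possible value in [0, maxValue]; preserves input order.
    public static List<Integer> dedupe(List<Integer> input, int maxValue) {
        BitSet seen = new BitSet(maxValue + 1);
        List<Integer> result = new ArrayList<>();
        for (int value : input) {
            if (!seen.get(value)) {
                seen.set(value);
                result.add(value);
            }
        }
        return result;
    }
}
```

For a dense range of small integers this uses one bit per possible value, whereas a HashSet stores a boxed Integer plus node overhead per element; for very sparse or very large values the HashSet may still win.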