I have a list (List<T> list
) and I want to index its objects by their ids using a map (HashMap<Integer, T> map
). I always use list.size()
as the initial capacity in the HashMap
constructor,like in the code below. Is this the best initial capacity to be used in this case?
Note: I'll never add more items to the map.
List<T> list = myList; Map<Integer, T> map = new HashMap<Integer, T>(list.size()); for(T item : list) { map.put(item.getId(), item); }
Initial Capacity of HashMap The initial capacity of the HashMap is the number of buckets in the hash table. It creates when we create the object of HashMap class. The initial capacity of the HashMap is 24, i.e., 16. The capacity of the HashMap is doubled each time it reaches the threshold.
Finally, the default initial capacity of the HashMap is 16. As the number of elements in the HashMap increases, the capacity is expanded. The load factor is the measure that decides when to increase the capacity of the Map. The default load factor is 75% of the capacity.
Constructs an empty HashMap with the default initial capacity (16) and the default load factor (0.75).
There is no universal answer to which is the best key, since there is no best key. The best key for use in your hash map is one that you need for retrieval. Just make sure the key's hashCode() is quick and uses the int space appropriately.
If you wish to avoid rehashing the HashMap
, and you know that no other elements will be placed into the HashMap
, then you must take into account the load factor as well as the initial capacity. The load factor for a HashMap
defaults to 0.75.
The calculation to determine whether rehashing is necessary occurs whenever an new entry is added, e.g. put
places a new key/value. So if you specify an initial capacity of list.size()
, and a load factor of 1, then it will rehash after the last put
. So to prevent rehashing, use a load factor of 1 and a capacity of list.size() + 1
.
EDIT
Looking at the HashMap
source code, it will rehash if the old size meets or exceeds the threshold, so it won't rehash on the last put
. So it looks like a capacity of list.size()
should be fine.
HashMap<Integer, T> map = new HashMap<Integer, T>(list.size(), 1.0);
Here's the relevant piece of HashMap
source code:
void addEntry(int hash, K key, V value, int bucketIndex) { Entry<K,V> e = table[bucketIndex]; table[bucketIndex] = new Entry<>(hash, key, value, e); if (size++ >= threshold) resize(2 * table.length); }
The 'capacity' keyword is incorrect by definition and isn't used in the way typically expected.
By default the 'load factor' of a HashMap is 0.75, this means that when the number of entries in a HashMap reaches 75% of the capacity supplied, it will resize the array and rehash.
For example if I do:
Map<Integer, Integer> map = new HashMap<>(100);
When I am adding the 75th entry, the map will resize the Entry table to 2 * map.size() (or 2 * table.length). So we can do a few things:
The best option is the latter of the two, let me explain what's going on here:
list.size() / 0.75
This will return list.size() + 25% of list.size(), for example if my list had a size of 100 it would return 133. We then add 1 to it as the map is resized if the size of it is equal to 75% of the initial capacity, so if we had a list with a size of 100 we would be setting the initial capacity to 134, this would mean that adding all 100 entries from the list would not incur any resize of the map.
End result:
Map<Integer, Integer> map = new HashMap<>(list.size() / 0.75 + 1);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With