Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What if a HashMap is full?

I know java Hashmap has a capacity and load factor parameter.So , if the number of items in this hashmap is more than capacity* load factor, a new hashmap will be reconstructed.I have some questions about the re-constructions of it:

  1. The previous hashmap will be reclaimed or still be in use if the reconstruction happened?
  2. since we need a larger size hashmap , so , will the hash function be changed ?
  3. For ConcurrentHashMap , what if one thread is inserting(Of cource, this insert operation has lead to a re-construction) and another thread is reading?For example, it will read from the old hashmap or from the new hashmap?
like image 525
wuchang Avatar asked Jan 28 '26 16:01

wuchang


2 Answers

The previous hashmap will be reclaimed or still be in use if the reconstruction happened?

It's still the same hashmap, just the internal storage is reconstructed. After reconstruction the old bucket array is not needed anymore and could be gc'ed.

Update: Internally HashMap has Node<K,V>[] table. During resize a new array will be constructed, the elements are moved and then table is replaced with the new array. After that operation the map itself would not reference the old array anymore so unless there are no other references (which is unlikely due to table being package private) it is elligible for gc.

since we need a larger size hashmap , so , will the hash function be changed ?

No, the hash function won't change. It generally doesn't depend on the number of buckets but the generated hash will be adjusted to get the correct bucket (e.g. by applying a modulo)

Update: HashMap calculates the bucket index like this: (size - 1) & hash, hash is the return value of the key's hashCode() method, which doesn't depend on the map itself.

For ConcurrentHashMap , what if one thread is inserting(Of cource, this insert operation has lead to a re-construction) and another thread is reading?For example, it will read from the old hashmap or from the new hashmap?

I'd have to guess here (I'll look up the code later) but I assume as long as threads are reading from the old buckets they'll still be in use and will be freed later.

Update: I had a quick look at the ConcurrentHashMap sources and there are reference to the current table which is used by get() and a possible nextTable which is the target for resize operations. During resize the elements are transferred to nextTable and at the end table is set to nextTable, effectively switching tables.

This means that during resizing the old table is still read and at some point it gets replaced. During insert operations there might be some blocking, e.g. by using synchronized blocks, especially if a resize is needed or already being performed.

The documentation also hints at this:

A hash table supporting full concurrency of retrievals and high expected concurrency for updates.

This means that get won't block but put, remove etc. might block at some point.

like image 56
Thomas Avatar answered Jan 31 '26 05:01

Thomas


From HashMap API

1)

An instance of HashMap has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.

2) If you know the size well in advance better create hashbuckets while constructing map object.

HashMap(int initialCapacity)
like image 23
csarathe Avatar answered Jan 31 '26 06:01

csarathe



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!