Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why did the language designers of Java preferred chaining over open addressing for most hash based structures except for some like ThreadLocal? [closed]

I know the difference between Open Addressing and Chaining for resolving hash collisions . Most of the basic hash based data structures like HashSet,HashMap in Java primarily use chaining technique. I read that ThreadLocal actually uses a probing scheme . So I want to understand why is open addressing not so much used in Java ? I mean it would be difficult to delete records using that scheme , in the sense that you have to mark those cells with some special handling . However it seems like memory requirement will be low for open addressing scheme.

Edit : I just want to understand the possible major reason/reasons for this design decision . I do not want finer details . Also I would like to know why ThreadLocal uses the lesser common technique of open addressing . I guess the two answers can be related together . So I prefer to ask in the same question itself.

like image 457
Geek Avatar asked Aug 18 '12 14:08

Geek


People also ask

Why is open addressing better than chaining?

Open addressing is used when the frequency and number of keys is known. 5. Cache performance of chaining is not good as keys are stored using linked list. Open addressing provides better cache performance as everything is stored in the same table.

What is the advantage of hashing with chaining?

What is the advantage of hashing with chaining? Explanation: Hashing with separate chaining has an advantage that it is less sensitive to a hash function. It is also easy to implement.

Does Java HashMap use chaining or open addressing?

Most of the basic hash based data structures like HashSet , HashMap in Java primarily use chaining technique.

What is hash collision in Hashtable and how it is handled in Java?

A collision, or more specifically, a hash code collision in a HashMap, is a situation where two or more key objects produce the same final hash value and hence point to the same bucket location or array index.


1 Answers

I am currently discussing memory-compact reimplementations of HashMap and HashSet with, among others, Doug Lea. This particular question hasn't come up, but here's my first thoughts on the question...

  • Chained hash tables degrade reasonably gracefully. Whether it's higher load factors or lots of hash collisions, chaining doesn't degrade nearly as quickly as open addressing can.
  • As you've said, remove is...not a pleasant operation on open-addressed tables. As a general rule, remove is the least common operation on hash tables, but there are applications for which it's more common, and bad performance would be noticed.
  • I also suspect -- though I don't have much data -- that implementing a "linked" open-addressed hash table would be noticeably more difficult. LinkedHashMap is written as a subclass of HashMap, and borrows most of the implementation details; it's somewhat easier to implement the linked list of entries when the entries are discrete objects -- and at that point, you're already most of the way to a chained implementation.
  • Nothing in the spec ties them to this implementation -- they're always free to mess around with it later.
  • The JDK collections libraries...don't make memory consumption an especially high priority. Memory is cheap. (You may or may not agree with this, but it's definitely a noticeable trend.)
like image 135
Louis Wasserman Avatar answered Oct 23 '22 12:10

Louis Wasserman