
How does Java implement hash tables?

Tags: java, algorithm

Does anyone know how Java implements its hash tables (HashSet or HashMap)? Given the various types of objects that one may want to put in a hash table, it seems very difficult to come up with a hash function that would work well for all cases.

escalon asked Oct 29 '09 23:10




2 Answers

HashMap and HashSet are very similar. In fact, a HashSet is backed by an internal HashMap instance: each element of the set is stored as a key in that map.
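To make the relationship concrete, here is a minimal sketch of a set built on top of a HashMap. The class name MapBackedSet and the PRESENT placeholder are made up for illustration; this is not the JDK source, just the same idea in a few lines.

```java
import java.util.HashMap;

// Hypothetical class for illustration only; not the JDK's HashSet.
class MapBackedSet<E> {
    // Dummy value stored against every key; only the keys matter.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<>();

    // Returns true if the element was not already in the set.
    boolean add(E element) {
        return map.put(element, PRESENT) == null;
    }

    boolean contains(Object element) {
        return map.containsKey(element);
    }

    // Returns true if the element was present and has been removed.
    boolean remove(Object element) {
        return map.remove(element) == PRESENT;
    }

    int size() {
        return map.size();
    }
}
```

Because every operation delegates to the map, the set inherits the map's bucket layout, load factor and resizing behaviour.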

A HashMap stores its entries in an array of buckets. The array size is always a power of 2. If you don't specify another value, there are initially 16 buckets.
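Since the array size is always a power of 2, a requested capacity that is not one gets rounded up. The sketch below shows one way to do that rounding. TableSize and nextPowerOfTwo are hypothetical names, and the cap of 2^30 is an assumption that mirrors the largest power of two an int-indexed table can reasonably use; the JDK's own code is written differently.

```java
// Illustrative sketch; names are hypothetical and this is not the JDK source.
class TableSize {
    // Assumed upper bound: the largest power of two that fits in a positive int.
    static final int MAX_CAPACITY = 1 << 30;

    // Rounds the requested capacity up to the nearest power of two.
    static int nextPowerOfTwo(int requested) {
        if (requested <= 1) {
            return 1;
        }
        if (requested >= MAX_CAPACITY) {
            return MAX_CAPACITY;
        }
        // highestOneBit(requested - 1) is the largest power of two that is
        // strictly less than `requested`; doubling it gives the result.
        return Integer.highestOneBit(requested - 1) << 1;
    }
}
```

For example, a request for 17 buckets would end up with 32, while a request for exactly 16 stays at 16.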

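When you put an entry (key and value) in the map, it chooses the bucket from the key's hash code, which is whatever the key's own hashCode() method returns. That is why one structure can work for any type of object: each class supplies its own hash function. (The hash code is not necessarily the object's memory address, and the bucket index is not computed with a modulus: because the array size is a power of 2, a bit mask is used instead.) Different entries can collide in the same bucket, in which case they are chained together in a list.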
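A rough sketch of how a bucket index can be derived from a key follows. The class and method names are made up. The bit-mixing step is modelled on what recent OpenJDK versions do (XOR the high 16 bits into the low 16 bits), but treat the details as an approximation and check the source for your JDK version.

```java
// Illustrative sketch; names are hypothetical and this is not the JDK source.
class BucketIndex {
    // Fold the high 16 bits into the low 16 bits, so that hash codes that
    // differ only in their upper bits do not all land in the same bucket.
    static int spread(int hashCode) {
        return hashCode ^ (hashCode >>> 16);
    }

    // tableLength must be a power of two, so (tableLength - 1) is a mask of
    // low-order one bits. Unlike the % operator, the mask never yields a
    // negative index, even when the hash code is negative.
    static int indexFor(Object key, int tableLength) {
        int hash = (key == null) ? 0 : spread(key.hashCode());
        return hash & (tableLength - 1);
    }
}
```

With a 16-bucket table, the key "a" (hash code 97) lands in bucket 1, and a key whose hash code is -1 lands in bucket 0, whereas -1 % 16 would have given -1.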

Entries can be inserted until their number exceeds the capacity multiplied by the load factor. The load factor is 0.75 by default, and changing it is not recommended unless you are sure of what you are doing. A load factor of 0.75 means that a HashMap with 16 buckets can hold only 12 entries (16 * 0.75) before it has to grow. At that point a new array of buckets is created, twice the size of the previous one, and all entries are redistributed into the new array. This process is known as rehashing, and it can be expensive.
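The growth arithmetic can be simulated in a few lines. This is only a model of the rule described above, assuming the table doubles as soon as the entry count exceeds capacity times load factor; ResizeSketch and capacityAfter are hypothetical names, and no real HashMap is involved.

```java
// Illustrative model of the resize rule; not the JDK source.
class ResizeSketch {
    // Returns the bucket count after inserting `entries` distinct keys into
    // an empty table that starts with `initialCapacity` buckets.
    static int capacityAfter(int entries, int initialCapacity, float loadFactor) {
        int capacity = initialCapacity;
        int threshold = (int) (capacity * loadFactor);
        for (int size = 1; size <= entries; size++) {
            // Assumption: the table doubles once the size exceeds the threshold.
            if (size > threshold) {
                capacity *= 2;
                threshold = (int) (capacity * loadFactor);
            }
        }
        return capacity;
    }
}
```

Under this model, starting from 16 buckets with a load factor of 0.75, 12 entries still fit in 16 buckets, the 13th entry doubles the table to 32, and the 25th doubles it again to 64.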

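Therefore, if you know how many entries will be inserted, a best practice is to construct the HashMap with enough initial capacity to avoid rehashing. Note that the constructor argument is the number of buckets, not the number of entries, so divide the expected entry count by the load factor:

new HashMap<>((int) Math.ceil(finalSize / 0.75));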

sinuhepop answered Sep 23 '22 13:09


You can check the source of HashMap, for example.

João Silva answered Sep 22 '22 13:09