Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do HashTables deal with collisions?

I've heard in my degree classes that a HashTable will place a new entry into the 'next available' bucket if the new Key entry collides with another.

How would the HashTable still return the correct Value if this collision occurs when calling for one back with the collision key?

I'm assuming that the Keys are String type and the hashCode() returns the default generated by say Java.

If I implement my own hashing function and use it as part of a look-up table (i.e. a HashMap or Dictionary), what strategies exist for dealing with collisions?

I've even seen notes relating to prime numbers! Information not so clear from Google search.

like image 876
Alex Avatar asked Feb 12 '11 21:02

Alex


People also ask

How does a hash table handle collisions?

An alternative method for handling the collision problem is to allow each slot to hold a reference to a collection (or chain) of items. Chaining allows many items to exist at the same location in the hash table. When collisions happen, the item is still placed in the proper slot of the hash table.

Can hash tables have collisions?

A collision occurs when two keys are hashed to the same index in a hash table. Collisions are a problem because every slot in a hash table is supposed to store a single element.

What happens when hashes collide?

In computer science, a hash collision or clash is when two pieces of data in a hash table share the same hash value. The hash value in this case is derived from a hash function which takes a data input and returns a fixed length of bits.


2 Answers

Hash tables deal with collisions in one of two ways.

Option 1: By having each bucket contain a linked list of elements that are hashed to that bucket. This is why a bad hash function can make lookups in hash tables very slow.

Option 2: If the hash table entries are all full then the hash table can increase the number of buckets that it has and then redistribute all the elements in the table. The hash function returns an integer and the hash table has to take the result of the hash function and mod it against the size of the table that way it can be sure it will get to bucket. So by increasing the size, it will rehash and run the modulo calculations which if you are lucky might send the objects to different buckets.

Java uses both option 1 and 2 in its hash table implementations.

like image 184
ams Avatar answered Nov 01 '22 20:11

ams


When you talked about "Hash Table will place a new entry into the 'next available' bucket if the new Key entry collides with another.", you are talking about the Open addressing strategy of Collision resolution of hash table.


There are several strategies for hash table to resolve collision.

First kind of big method require that the keys (or pointers to them) be stored in the table, together with the associated values, which further includes:

  • Separate chaining

enter image description here

  • Open addressing

enter image description here

  • Coalesced hashing
  • Cuckoo hashing
  • Robin Hood hashing
  • 2-choice hashing
  • Hopscotch hashing

Another important method to handle collision is by Dynamic resizing, which further has several ways:

  • Resizing by copying all entries
  • Incremental resizing
  • Monotonic keys

EDIT: the above are borrowed from wiki_hash_table, where you should go to have a look to get more info.

like image 24
herohuyongtao Avatar answered Nov 01 '22 18:11

herohuyongtao