
Hash table - why is it faster than arrays?

In cases where I have a key for each element and I don't know the index of the element in the array, hash tables perform better than arrays (O(1) vs O(n)).

Why is that? I mean: I have a key, I hash it... I have the hash. Shouldn't the algorithm compare this hash against every element's hash? I think there's some trick behind the memory disposition, isn't there?

asked Aug 18 '12 by Johnny Pauling


People also ask

Which operation is faster in a hash table than in an array?

Search by key. When you have a key for each element but don't know its index in the array, a hash table performs better: the hash table search is O(1) in the average case, versus O(n) for scanning the array.

Why do we use hashing and not arrays?

You use a hash table when you don't have a numeric index, because the key is something like a string. Also, if the indexes in use are sparsely distributed, allocating an array with 20 billion elements just to hold a few of them is a huge waste of memory.
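As a hypothetical Python illustration of the sparse-index point: a dict stores only the entries you actually insert, while an array indexed directly by such keys would need billions of slots:

    # Hypothetical sketch: sparse keys in a dict vs. a directly indexed list.
    sparse = {5: "a", 20_000_000_000: "b"}   # only 2 entries actually stored
    print(sparse[20_000_000_000])            # "b", found in O(1)

    # A list indexed directly by the key would need ~20 billion slots:
    # big = [None] * 20_000_000_001          # far too much memory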

Why are hash maps so fast?

Hash maps use the key's hash code to directly access the bucket where the entry is stored. This is an O(1) access. If more than one element ends up in that bucket because the keys share the same or a similar hash code, you need a few more comparisons, but it's still far faster than iterating through a list searching for the element.
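As a rough sketch in Python (the bucket count m is an arbitrary assumption), the key's hash is reduced to a bucket index with a single modulo operation:

    # Minimal sketch: mapping a key's hash to one of m buckets.
    m = 16                        # number of buckets (assumed)
    key = "johnny"
    index = hash(key) % m         # one arithmetic step -> O(1) bucket access
    print(f"key {key!r} lives in bucket {index}")

No search happens here: the index is computed from the key itself, which is why no comparison against every element's hash is needed.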

What is the advantage of using a hash table?

The main advantage of hash tables over other data structures is speed. The access time of an element is on average O(1), so lookups can be performed very fast. Hash tables are particularly efficient when the maximum number of entries can be predicted in advance.


1 Answer

In cases where I have a key for each element and I don't know the index of the element in the array, hash tables perform better than arrays (O(1) vs O(n)).

The hash table search performs O(1) in the average case. In the worst case it performs O(n): that happens when you have collisions and, at the extreme, when the hash function returns the same slot for every key. One might think "this is a remote situation", but a good analysis should consider it anyway. In that case you must iterate through all the elements, as in an array or a linked list (O(n)).
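As an illustration (bad_hash is a made-up, deliberately pathological function), a constant hash function piles every entry into one bucket, turning lookups into an O(n) scan of a single chain:

    # Degenerate case: every key hashes to slot 0, so all entries
    # collide and the table becomes one long chain searched in O(n).
    def bad_hash(key, m):
        return 0                    # constant hash: the worst possible choice

    table = [[] for _ in range(8)]  # 8 buckets, chaining via lists
    for k in ["a", "b", "c", "d"]:
        table[bad_hash(k, 8)].append((k, None))

    print(table[0])                 # all four entries piled into bucket 0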

Why is that? I mean: I have a key, I hash it... I have the hash. Shouldn't the algorithm compare this hash against every element's hash? I think there's some trick behind the memory disposition, isn't there?

You have a key, you hash it, and you get the hash: the index of the hash table slot where the element lives (if it was inserted earlier). At this point you can access that hash table slot in O(1). If the load factor is small, it's unlikely that more than one element is there, so the first element you see should be the one you are looking for. Otherwise, if the slot holds more than one element, you must compare each element found at that position with the element you are looking for. In that case you have O(1) + O(number_of_elements_in_the_slot).
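Here is a minimal Python sketch of that two-step process (the HashTable class and its put/get methods are illustrative, not any particular library's API): an O(1) jump to the slot, then a comparison against the few entries chained there:

    class HashTable:
        """Toy hash table with collision resolution by chaining."""

        def __init__(self, num_slots=8):
            self.slots = [[] for _ in range(num_slots)]

        def _index(self, key):
            return hash(key) % len(self.slots)   # O(1) jump to the slot

        def put(self, key, value):
            slot = self.slots[self._index(key)]
            for i, (k, _) in enumerate(slot):    # replace value if key exists
                if k == key:
                    slot[i] = (key, value)
                    return
            slot.append((key, value))            # otherwise chain a new entry

        def get(self, key):
            slot = self.slots[self._index(key)]
            for k, v in slot:                    # compare only chained entries
                if k == key:
                    return v
            raise KeyError(key)

    t = HashTable()
    t.put("johnny", 42)
    print(t.get("johnny"))                       # 42

Note that get never scans the whole table: it computes the slot index directly from the key and only compares the handful of entries that collided into that slot.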

In the average case, the hash table search complexity is O(1) + O(load_factor) = O(1 + load_factor), where load_factor = n/m is the number of stored elements n divided by the number of slots m.

Remember, load_factor = O(n) in the worst case, when all n elements end up chained in a single slot. So, the search complexity is O(n) in the worst case.
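A quick empirical sketch (the slot and key counts are chosen arbitrarily) showing that with a decent hash function the chains stay near the load factor n/m rather than degenerating to length n:

    import random
    import string

    m, n = 1000, 500                 # slots and keys (assumed values)
    slots = [0] * m
    for _ in range(n):
        key = "".join(random.choices(string.ascii_lowercase, k=8))
        slots[hash(key) % m] += 1    # count entries landing in each slot

    # The average chain length is n/m by construction; the interesting
    # figure is the longest chain, which stays small for a good hash.
    print(f"load factor {n / m}, longest chain {max(slots)}")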

I don't know what you mean by "trick behind the memory disposition". In a sense, the hash table itself (with its slot structure and collision resolution by chaining) can be considered a "smart trick".

Of course, these complexity results can be proven mathematically.

answered Sep 22 '22 by bitfox