
Compare hash table with binary search tree

We all know that a hash table has O(1) time for both inserts and look-ups if the hash function is well chosen. So what are the reasons to use a binary search tree instead? Is it just because a perfect hash function is hard to design?

Here is how I came up with this question: I noticed that the standard C++ STL has set and map, which are implemented with binary search trees, but has no hash containers (not counting the non-standard hash_set and hash_map), while Ruby has only Hash. I want to understand the rationale behind this difference.

pierrotlefou asked Oct 13 '09 09:10



2 Answers

Trees allow in-order traversal.
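As a minimal sketch of what in-order traversal buys you, the helpers below walk a std::map in ascending key order and pull out a key range with lower_bound. The function names keys_in_order and keys_in_range are made up for this illustration; only the std::map calls come from the standard library.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Collect the keys of a std::map in iteration order.
// std::map iterates in ascending key order (std::less by default).
std::vector<int> keys_in_order(const std::map<int, std::string>& m) {
    std::vector<int> keys;
    for (const auto& kv : m) keys.push_back(kv.first);
    return keys;
}

// Collect the keys in the half-open range [lo, hi).
// lower_bound finds the first key >= lo in O(log N); the loop then
// simply walks forward through the sorted sequence.
std::vector<int> keys_in_range(const std::map<int, std::string>& m,
                               int lo, int hi) {
    assert(lo <= hi);
    std::vector<int> keys;
    for (auto it = m.lower_bound(lo); it != m.end() && it->first < hi; ++it)
        keys.push_back(it->first);
    return keys;
}
```

A hash container has no equivalent of the range query: its iteration order is unspecified, so answering "all keys between lo and hi" would mean scanning every element.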

The worst-case performance for a hash table is O(N) (a linear search through one bucket), whereas a lookup in a binary search tree is bounded by O(log N).

NB: this requires the tree to be balanced, which is why typical implementations use a self-balancing tree, such as a red-black tree.

While such degradation of a hash table is unlikely, it is not impossible, and it depends strongly on the ability to choose an appropriate hash function and on the distribution of the actual data.
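The degenerate case can be provoked on purpose. The sketch below uses std::unordered_set (standardized in C++11, so after this question was asked) with a deliberately bad, hypothetical ConstantHash that sends every key to the same bucket; largest_bucket is likewise a name invented for this example.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical, deliberately bad hash: every key gets the same hash value,
// so every key ends up in the same bucket.
struct ConstantHash {
    std::size_t operator()(int) const noexcept { return 0; }
};

// Insert 0..n-1 using the given hash and return the size of the fullest bucket.
template <class Hash>
std::size_t largest_bucket(int n) {
    std::unordered_set<int, Hash> s;
    for (int i = 0; i < n; ++i) s.insert(i);

    std::size_t worst = 0;
    for (std::size_t b = 0; b < s.bucket_count(); ++b)
        if (s.bucket_size(b) > worst) worst = s.bucket_size(b);
    return worst;
}
```

With ConstantHash the fullest bucket holds all n elements, so each lookup is a linear scan; with std::hash<int> the elements are spread across many buckets. Real-world degradation is rarely this extreme, but it follows the same mechanism.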

A tree implementation also grows trivially to the required size, whereas a hash map starts to degrade when it gets full (for most implementations, reportedly at around 70% of the buckets filled). You then need either to rehash the entire table (again, bad for real-time apps) or to move incrementally to a new table, which is not simple to implement.
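The rehashing cost can be observed with std::unordered_set (C++11). This is only a sketch: count_rehashes is a hypothetical helper, and it detects a rehash indirectly by watching bucket_count() change. Note that the standard containers default to a max_load_factor() of 1.0 rather than the roughly 70% figure mentioned above, which is more typical of open-addressing tables.

```cpp
#include <cstddef>
#include <unordered_set>

// Insert 0..n-1 and count how many times the bucket array changes size.
// If reserve_first is true, the table is sized for n elements up front.
std::size_t count_rehashes(int n, bool reserve_first) {
    std::unordered_set<int> s;
    if (reserve_first) s.reserve(static_cast<std::size_t>(n));

    std::size_t rehashes = 0;
    std::size_t buckets = s.bucket_count();
    for (int i = 0; i < n; ++i) {
        s.insert(i);
        if (s.bucket_count() != buckets) {  // the table was rebuilt
            ++rehashes;
            buckets = s.bucket_count();
        }
    }
    return rehashes;
}
```

Without reserve, some inserts trigger a full rebuild of the table, which is the latency spike that matters for real-time code. Calling reserve avoids it, but only if the number of elements is known in advance.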

In the end, the STL probably just went with one "base" container template, the tree, to avoid the additional implementation complexity.

peterchen answered Oct 27 '22 14:10


To add to peterchen's answer: hash structures, although theoretically faster at insertion and removal, depend heavily on the actual data, the chosen hash function, and the amount of data.

  • A perfect hash function depends on the amount and distribution of the data.

Large performance variations between the best and worst cases make them unfit as general-purpose structures. Binary trees, on the other hand, are more predictable regardless of the amount or type of data, even though they are less efficient in the best case.

Vasco Fernandes answered Oct 27 '22 14:10