We all know that a hash table has O(1) time for both inserts and look-ups if the hash function was well chosen. So, what are the reason we want to use Binary Search Tree? Just because a perfect hash function was hard to design?
Here how I come up with this question?
I notice that Standard C++ STL has set
and map
which are implemented with Binary search tree ,but has no hash (not talking about non-stardard hash_set
, hash_map
). While, Ruby has only Hash
. I want to understand the rational behind this difference.
Binary Search Trees are generally memory-efficient since they do not reserve more memory than they need to. On the other hand, Hash tables can be a bit more demanding if we don't know the exact number of elements we want to store.
Time Complexity Comparison Therefore, the performance is the same. However, in the average case scenario, hash lookup is significantly faster than binary search. In real applications, we mainly consider an average case scenario in order to test and compare the performance of different methods.
Hashing. For implementing associative arrays, hash tables, a data structure that maps keys to records using a hash function, are generally faster than binary search on a sorted array of records. Most hash table implementations require only amortized constant time on average.
What is the advantage of a hash table over BST? Explanation: Hash table and BST both are examples of data structures. Hash table has an advantage that it has a better time complexity for performing insert, delete and search operations.
Trees allow in-order traversion.
The worst case performance for a hash table is O(N) (linear search through one bucket), a binary search is bound by O(log N).
NB: this requires the tree to be balanced - that's why typical implementation use a self-balancing tree, suhc as a red-black tree.
While such a degradation is unlikely, it is not impossible and depends strongly on the ability to chose an appropriate hash function and the distribution of the actual data.
A tree implementation also grows trivially to the required size, whereas a hashmap starts to degrade when it gets full (for most implementations, it's said around 70% of the buckets filled). You either need to rehash the entire table (again, bad fo real time apps), or incrementally move to a new table, which is not a simple implementation.
In the end, STL probably just went with one "base" container template, the tree, to avoid the additional implementation complexity.
To add on peterchen answer, hash structures although theoretically faster at insertion and removal depend vastly on the actual data, the chosen hash function and the amount of data.
Having large performance variations between best and worst cases makes them unfit for general purpose structures. Binary trees on the other hand are more predictable independently of the amount/type of data used, even though less efficient on best case scenario.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With