compare Hash with Binary search tree

Tags:

hash

binary-tree

We all know that a hash table has O(1) time for both inserts and look-ups if the hash function is well chosen. So what are the reasons to use a binary search tree? Is it just because a perfect hash function is hard to design?

Here is how I came up with this question: I noticed that the standard C++ STL has set and map, which are implemented with binary search trees, but has no hash containers (not counting the non-standard hash_set and hash_map), while Ruby has only Hash. I want to understand the rationale behind this difference.

asked Oct 13 '09 09:10

pierrotlefou

2 Answers

Trees allow in-order traversal.
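To make the traversal point concrete, here is a minimal sketch using std::map. The helper names keys_in_order and keys_in_range are made up for this illustration. It shows that iteration yields keys in ascending order regardless of insertion order, and that the same ordering allows a simple range scan via lower_bound.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: collect the keys of a std::map in iteration order.
// std::map iterates in ascending key order, whatever the insertion order was.
std::vector<std::string> keys_in_order(const std::map<std::string, int>& m) {
    std::vector<std::string> keys;
    for (const auto& entry : m) {
        keys.push_back(entry.first);
    }
    return keys;
}

// Hypothetical helper: keys in the half-open range [lo, hi).
// lower_bound finds the first key >= lo in O(log N); from there we walk in order.
std::vector<std::string> keys_in_range(const std::map<std::string, int>& m,
                                       const std::string& lo,
                                       const std::string& hi) {
    std::vector<std::string> keys;
    for (auto it = m.lower_bound(lo); it != m.end() && it->first < hi; ++it) {
        keys.push_back(it->first);
    }
    return keys;
}
```

A hash table gives no such ordering, so getting sorted output from it would typically mean copying the keys out and sorting them.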

The worst-case performance for a hash table is O(N) (linear search through one bucket), whereas a lookup in a binary search tree is bounded by O(log N).

NB: this requires the tree to be balanced, which is why typical implementations use a self-balancing tree, such as a red-black tree.

While such a degradation of the hash table is unlikely, it is not impossible, and it depends strongly on the ability to choose an appropriate hash function and on the distribution of the actual data.

A tree implementation also grows trivially to the required size, whereas a hash map starts to degrade when it gets full (for most implementations, reportedly at around 70% of the buckets filled). You either need to rehash the entire table (again, bad for real-time apps), or incrementally move to a new table, which is not a simple implementation.
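The degenerate case can be reproduced on purpose. The sketch below uses std::unordered_set (standardized in C++11, so after this question was asked) with a deliberately bad hash functor. ConstantHash and largest_bucket_with_constant_hash are names invented for this example. Because every key gets the same hash code, all elements end up in one bucket and each lookup becomes a linear scan of that bucket.

```cpp
#include <cstddef>
#include <unordered_set>

// Deliberately bad hash, for illustration only: every key hashes to 0.
struct ConstantHash {
    std::size_t operator()(int) const { return 0; }
};

// Insert 0..n-1 into a set using ConstantHash and return the size of the
// largest bucket. Keys with equal hash codes share a bucket, so this is n.
std::size_t largest_bucket_with_constant_hash(int n) {
    std::unordered_set<int, ConstantHash> s;
    for (int i = 0; i < n; ++i) {
        s.insert(i);
    }
    std::size_t largest = 0;
    for (std::size_t b = 0; b < s.bucket_count(); ++b) {
        if (s.bucket_size(b) > largest) {
            largest = s.bucket_size(b);
        }
    }
    return largest;
}
```

Real hash functions are far better than this, but unlucky or adversarial key distributions can push a table toward the same behaviour.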
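As a rough illustration of the growth cost, this sketch counts how often std::unordered_map (C++11) rebuilds its bucket array while elements are inserted. count_rehashes is a hypothetical helper. The exact growth policy is implementation-defined, and the default max_load_factor() of std::unordered_map is 1.0 rather than the 70% mentioned above, so treat the numbers as an observation about your standard library, not a rule.

```cpp
#include <cstddef>
#include <unordered_map>

// Hypothetical helper: insert keys 0..n-1 and count how many times the
// bucket count changes. Each change means the whole table was rehashed.
std::size_t count_rehashes(int n) {
    std::unordered_map<int, int> m;
    std::size_t rehashes = 0;
    std::size_t buckets = m.bucket_count();
    for (int i = 0; i < n; ++i) {
        m[i] = i;
        if (m.bucket_count() != buckets) {
            ++rehashes;
            buckets = m.bucket_count();
        }
    }
    return rehashes;
}
```

Each of those rehashes touches every element already stored, which is the latency spike that hurts real-time code. If the final size is known in advance, calling reserve() before inserting is the usual way to avoid rehashes during the insert loop.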

In the end, STL probably just went with one "base" container template, the tree, to avoid the additional implementation complexity.

answered Oct 27 '22 14:10

peterchen

To add to peterchen's answer: hash structures, although theoretically faster at insertion and removal, depend heavily on the actual data, the chosen hash function and the amount of data.

A perfect hash function depends on the amount and distribution of the data.

Having large performance variations between best and worst cases makes them unfit for general-purpose structures. Binary trees, on the other hand, are more predictable regardless of the amount or type of data used, even though they are less efficient in the best case.

answered Oct 27 '22 14:10

Vasco Fernandes
