Binary heaps vs d-ary heaps

Tags:

I've read that binary heaps are faster at delete minimum operations and d-ary heaps are faster at at decrease priority operations (although I don't get why), but then I've also read that a 4-heap is faster at both of them compared to a binary heap.

So when do I use a binary heap and when do I use a d-ary heap? And how do I decide what the d should be for the d-ary heap?

350

asked Mar 18 '15 15:03

Just some guy

1 Answers

There are a few different factors at play here that, I believe, make it possible for all of the statements you've made to be true.

To see why this is, let's start by thinking about how a decrease-key in a d-ary heap works (we don't need to talk about binary heaps separately, since a binary heap is just a 2-ary heap). When performing a decrease-key, we change the priority of a node in the tree, then repeatedly swap it up with its parent until it either hits the root of the tree or its priority ends up becoming smaller than the priority of its parent. The number of times we have to do a swap is, in the worst case, given by the height of the d-ary heap. Since the number of nodes in each layer of a d-ary heap grows exponentially by a factor of d at each step, the height of a d-ary heap is O(log_d n) = O(log n / log d). This means that if you increase the value of d, the height of the d-ary heap will decrease, so decrease-keys and insertions will take less time. If you think about an extreme case, if you have a 10¹⁰⁰-ary heap, the number of layers in the tree will be about 100 times fewer than in a binary heap, so a decrease-key or insert will be about 100 times faster.

On the other hand, think about how a dequeue operation would work. To perform a dequeue, we swap the last leaf for the root, then repeatedly do the following: we scan across all the children of the current node, and if any of them are smaller than the current node, we swap the current node with the smallest of its children. Each of these iterations will require O(d) total comparisons to find the smallest child, and the number of iterations is given by the number of layers in the tree, which we've seen earlier is O(log n / log d). This means that the cost of a dequeue in a d-ary heap is O(d log n / log d). Since d grows much faster than log d (exponentially faster, in fact), as we increase d, the asymptotic - and actual - cost of a dequeue starts to rise. For example, in a 10¹⁰⁰-ary heap, you might have to compare each node to 10¹⁰⁰ children at each step, which is going to take a really long time! Therefore, d-ary heaps, as d gets larger and larger, tend to have much slower dequeues than binary heaps.

Now on to your last question: how is it still possible that a 4-ary heap would outperform a binary heap given the information here? I'm going to be perfectly honest and say that I have no idea if this is true, but that it (a) probably depends on the hardware and (b) wouldn't surprise me. Keep in mind that all of the preceding analyses tried to bound the cost of the d-ary heap operations by looking at quantities like the number of layers in the heap and the number of swaps made. However, this leaves out a lot of other factors, such as the cost of finding parents and children and locality of reference. For the first of these, note that in a d-ary heap, you can find your parent node by dividing your index by d. For d's that are perfect powers of two, this can be implemented with a simple, inexpensive bitshift (since n / 2^k = n >> k). For odd numbers or numbers that aren't powers of two, this requires a division, which (in some architectures) is more expensive than a bit shift. Additionally, there's the impact of locality of reference. Computers these days have a huge number of layers of caches in memory, and the cost of accessing memory that is in cache can be hundreds or thousands of times faster than the cost of accessing memory not in the cache. As you increase the value of d in a d-ary heap, there are fewer layers in the tree and the elements accessed are closer together, giving better locality. Finding the sweet spot probably requires some experimentation, and if it happens to be that d = 4 is the best on your machine, then go for it!

EDIT: as @moreON pointed out, for d = 4, the number of layers in the heap goes down by a factor of two and the number of comparisons per later goes up by a factor of two, which may actually give better overall performance due to cache effects and a lower overall tree height. Therefore, it's probably a good candidate to outperform a binary heap.

Hope this helps!

answered Oct 17 '22 04:10

templatetypedef

Related questions
                            
                                Find the most frequent triplet in an array
                            
                                Sorting integers with a binary trie?
                            
                                Most efficient way to produce a Python dictionary given a list
                            
                                Sum of K elements in array that equates to N
                            
                                Counting total number of points inside 2 circles
                            
                                64bit array operation by C/C++
                            
                                Is it possible to have multiple valid BSTs for a given set of data?
                            
                                C# alternative to tuple
                            
                                You are given a set of intervals S. You have to find all intervals in S that are contained in a given interval (a, b) in minimum time complexity
                            
                                Finding the minimum element in a given range greater than a given number
                            
                                Haskell: How to use this data structure
                            
                                What is band storage of matrix?
                            
                                How to find words from scrambled words
                            
                                Hibernate how to implement dynamic data structure
                            
                                Perfect List Structure?
                            
                                data structure for sorted browsing history
                            
                                C++ Priority Queue - Reorder based on updated priorities
                            
                                Shortest path in a complement graph algorithm
                            
                                How to modify a 2D array from multiple threads without synchronization
                            
                                How to decide order of a B-tree

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Binary heaps vs d-ary heaps

Tags:

data-structures

heap

binary-heap

Just some guy

People also ask

1 Answers

templatetypedef

Recent Activity

Donate For Us