
Argument for O(1) average-case complexity of heap insertion

The claim on the Wikipedia page for binary heaps is that insertion is O(log n) in the worst case, but O(1) on average:

The number of operations required depends only on the number of levels the new element must rise to satisfy the heap property, thus the insertion operation has a worst-case time complexity of O(log n) but an average-case complexity of O(1).

The linked page attempts to justify this as follows:

However, on average, the newly inserted element does not travel very far up the tree. In particular, assuming a uniform distribution of keys, it has a one-half chance of being greater than its parent; it has a one-half chance of being greater than its grandparent given that it is greater than its parent; it has a one-half chance of being greater than its great-grandparent given that it is greater than its parent, and so on [...] so that in the average case insertion takes constant time

This is surely nonsense, though? It seems to me that if the tree were randomly ordered then there would be a 50/50 chance that a new element was greater than its parent; but that since, roughly speaking, the large elements sink to the bottom, the chances are much less than 50/50 as the heap grows.

Is that right?

It's been like that on Wikipedia for several months...

Asked Sep 15 '16 by chiastic-security




1 Answer

There is a much better reference for the claim that average time heap insertion is O(1): the 1991 paper "Average Case Analysis of Heap Building by Repeated Insertion" by Hayward & McDiarmid. (This paper is linked in what is currently reference 4 of the Wikipedia article.) That paper in turn references a 1975 paper by Porter & Simon, "Random insertion into a priority queue structure" which deals with a single insertion into a heap, and demonstrates that the average case is O(1).

Intuitively, the argument is simple. Half of the nodes in a heap are leaves, and the leaves tend to hold the larger elements. If we assume for a moment that the leaves hold exactly the largest elements in the heap (rather than merely tending to be larger), then the probability that a new element belongs at a leaf -- i.e., that it falls in the upper half of the range of values -- would be exactly 0.5. If the new element does not belong at a leaf (also probability 0.5), we can repeat the argument on the truncated heap consisting only of the non-leaf nodes of the original heap, so the probability that the new element belongs at the second-lowest level is half of what remains: 0.25. Likewise, the probability that it belongs at the third level is 0.125, and so on. The expected number of levels we would have to search through is then 1*0.5 + 2*0.25 + 3*0.125 + ..., which is 2.
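As a quick numerical check of that series (this snippet is an illustration, not part of the papers cited above), the sum 1*0.5 + 2*0.25 + 3*0.125 + ... does converge to 2:

```python
# Expected number of levels searched under the idealized assumption that a new
# element settles at level k (counting the leaf level as 1) with probability 1/2**k.
# The series sum(k / 2**k) for k >= 1 converges to 2; sixty terms is plenty.
expected_levels = sum(k / 2**k for k in range(1, 60))
print(round(expected_levels, 6))  # -> 2.0
```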

Of course, the probability that a random new element is greater than a random second-level parent is not really 0.5; it is actually a little less. But as long as the probability of rising another level is bounded below 1 by a constant, the power series that computes the expected number of comparisons still sums to a constant. And it turns out that the constant is around 2.6.
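You can also observe the constant-average behaviour empirically. The sketch below (my own illustration, using a hand-rolled sift-up so swaps can be counted) inserts 100,000 uniform random keys into a min-heap and reports the average number of swaps per insertion, which stays a small constant rather than growing like log n:

```python
import random

def insert(heap, value):
    """Append value to the list-based min-heap, sift it up,
    and return the number of swaps performed."""
    heap.append(value)
    i = len(heap) - 1
    swaps = 0
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] <= heap[i]:
            break  # heap property restored
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent
        swaps += 1
    return swaps

random.seed(42)  # deterministic for reproducibility
heap, total_swaps, n = [], 0, 100_000
for _ in range(n):
    total_swaps += insert(heap, random.random())

# Average swaps per insertion is a small constant,
# far below log2(100_000) ~= 17.
print(total_swaps / n)
```

Note this measures swaps during repeated insertions into a growing heap, which is exactly the setting of the Hayward & McDiarmid paper.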

Also see this useful answer that, while discussing complexities of heaps, and contrasting them with those of BST, gives a detailed graphical analysis of the constant average insertion time in heaps.

Answered Oct 06 '22 by rici