I'm learning data structures and every source tells me not to use index 0 of the array while implementing heap, without giving any explanation why. I searched the web, searched StackExchange, and couldn't find an answer.
Why Array? Since a Binary Heap is a Complete Binary Tree, it can be easily represented as an array and array-based representation is space-efficient. Level Order Traversal of the heap will give the order in which elements are filled in the array.
Heaps are commonly implemented with an array. Any binary tree can be stored in an array, but because a binary heap is always a complete binary tree, it can be stored compactly. No space is required for pointers; instead, the parent and children of each node can be found by arithmetic on array indices.
A binary heap must always be what is called maximally balanced (also sometimes called complete). A maximally balanced tree: Has all available positions for nodes filled, except for possibly the last row, which must be filled left-to-right.
So best case is OMEGA(1). However if the new element we insert is 100 then we would have to call the max-heapify function which has running time of O(log n) and therefore in the worst case inserting a new element in the heap takes O(log n).
There's no reason why a heap implemented in an array has to leave the item at index 0 unused. If you put the root at 0, then the item at array[index]
has its children at array[index*2+1]
and array[index*2+2]
. The node at array[child]
has its parent at array[(child-1)/2]
.
Let's see.
root at 0 root at 1 Left child index*2 + 1 index*2 Right child index*2 + 2 index*2 + 1 Parent (index-1)/2 index/2
So having the root at 0 rather than at 1 costs you an extra add to find the left child, and an extra subtraction to find the parent.
For a more general case where it may not be a binary heap, but a 3-heap, 4-heap, etc where there are NUM_CHILDREN children for each node instead of 2 the formulas are:
root at 0 root at 1 Left child index*NUM_CHILDREN + 1 index*NUM_CHILDREN Right child index* NUM_CHILDREN + 2 index*NUM_CHILDREN + 1 Parent (index-1)/NUM_CHILDREN index/NUM_CHILDREN
I can't see those few extra instructions making much of a difference in the run time.
For reasons why I think it's wrong to start at 1 in a language that has 0-based arrays, see https://stackoverflow.com/a/49806133/56778 and my blog post But that's the way we've always done it!
As I found it in CLRS book, there is some significance in terms of performance, since generally, shift operators work very fast.
On most computers, the LEFT procedure can compute
2*i
in one instruction by simply shifting the binary representation of i left by one bit position. Similarly, the RIGHT procedure can quickly compute2*i+1
by shifting the binary representation of i left by one bit position and then adding in a 1 as the low-order bit. The PARENT procedure can computei/2
by shifting i right one bit position.
So, starting the heap at index 1 will probably make faster calculation of parent, left and right child indexes.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With