Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Binomial Heaps: proof that merge runs in O(log n) time

I am reading through Okasaki's Purely Functional Data Structures and am trying to do some of the exercises. One of them is to prove that binomial heap merge takes O(log n) time where n is the number of nodes in the heap.

functor BinomialHeap (Element:ORDERED):HEAP=
struct
  structure Elem=Element
  datatype Tree = Node of int*Elem.T*Tree list
  type Heap = Tree list
  fun link (t1 as Node (r,x1,c1), t2 as Node (_,x2,c2))=
    if Elem.leq(x1,x2)
    then Node (r+1,x1,t2::c1)
    else Node (r+1,x2,t1::c2)
  fun insTree (t,[])=[t]
     |insTree (t,ts as t'::ts')=
        if rank t < rank t' then t::ts else insTree(link(t,t'),ts')
  fun insert (x,ts)=insTree(Node(0,x,[]),ts) (*just for reference*)
  fun merge (ts1,[])=ts1
     |merge ([],ts2)=ts2
     |merge (ts1 as t1::ts1', ts2 as t2:ts2')=
        if rank t1 < rank t2 then t1::merge(ts1',ts2)
        else if rank t2 < rank t1 then t2::merge(ts1,ts2')
        else insTree (link(t1,t2), merge (ts1',ts2'))
end

It is clear that merge will call itself max(numNodes ts1, numNodes ts2) times, but since insTree is O(log n) worst case, can you explain how merge is O(log n)?

like image 400
Justin Raymond Avatar asked Sep 27 '22 19:09

Justin Raymond


1 Answers

First note that merge will be called at most (numNodes ts1 + numNodes ts2) times, and this is O(log n) times. (Just to be clear, ts1 and ts2 are lists of binomial trees, where such a tree of rank r contains exactly 2^r nodes, and each rank can occur at most once. Therefore, there are O(log n1) such trees in ts1 and O(log n2) in ts2, where n1 and n2 are the number of nodes in the heaps and n=n1+n2.)

The key point to notice is that insTree is called at most once for each rank (either through merge or recursively), and the largest possible rank is log_2(n). The reason is the following:

If insTree is called from merge, then say r = rank t1 = rank t2, and link(t1,t2) will have rank r+1. So let's say insTree is called for rank r+1. Now think about what happens with merge(ts1', ts2'). Let the smallest rank that occurs as a tree in ts1' and ts2' be r' >= r+1. Then insTree will be called again from merge for rank r'+1, since the two trees of rank r' will get linked to form a tree of rank r'+1. However, the merged heap merge(ts1', ts2') can therefore not contain a tree of rank r', and the previous call to insTree can therefore not recurse further than r'.

So putting things together:

  • insTree is called at most O(log n) times, with each call being constant time (since we count the recursion as a separate call)

  • merge is called at most O(log n) times, with each call being constant time (since we count the calls to insTree separately and link is constant time)

=> The entire merge operation is O(log n).

EDIT: By the way, merging binomial heaps is very much like adding binary numbers. A heap of size n will have a tree of rank r if and only if the binary number n has a '1' at the 2^r position. When merging such heaps, you proceed from the lowest rank to highest rank -- or least significant to most significant position. Trees of the same rank need to be linked (the 'ones' added), and inserted / "carried" into the higher rank positions.

like image 136
m7thon Avatar answered Oct 29 '22 23:10

m7thon