This is Exercise 15.5-4 of Introduction to Algorithms, 3rd edition, which is about Knuth's improvement to the DP approach to the optimal binary search tree problem.
The DP algorithm for an optimal binary search tree is:
OPTIMAL_BST(p, q, n)
    let e[1..n+1, 0..n], w[1..n+1, 0..n], and root[1..n, 1..n] be new tables
    for i = 1 to n + 1
        e[i, i - 1] = q[i - 1]
        w[i, i - 1] = q[i - 1]
    for l = 1 to n
        for i = 1 to n - l + 1
            j = i + l - 1
            e[i, j] = INFINITY
            w[i, j] = w[i, j - 1] + p[j] + q[j]
            for r = i to j
                t = e[i, r - 1] + e[r + 1, j] + w[i, j]
                if t < e[i, j]
                    e[i, j] = t
                    root[i, j] = r
    return e and root
The complexity is O(n^3).
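For reference, here is my own Python transcription of this pseudocode (a sketch, 1-indexed like the book, so p[0] is unused padding):

import math

def optimal_bst(p, q, n):
    # Direct transcription of the pseudocode above.
    # p[1..n] : probabilities of the keys k_1..k_n (p[0] is unused padding)
    # q[0..n] : probabilities of the dummy keys d_0..d_n
    # e[i][j] : expected search cost of an optimal BST for keys k_i..k_j
    # w[i][j] : total probability weight of that subproblem
    e = [[0.0] * (n + 1) for _ in range(n + 2)]
    w = [[0.0] * (n + 1) for _ in range(n + 2)]
    root = [[0] * (n + 1) for _ in range(n + 1)]

    for i in range(1, n + 2):
        e[i][i - 1] = q[i - 1]
        w[i][i - 1] = q[i - 1]

    for l in range(1, n + 1):            # l = number of keys in the subproblem
        for i in range(1, n - l + 2):
            j = i + l - 1
            e[i][j] = math.inf
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            for r in range(i, j + 1):    # try every key as the root: O(n) work per cell
                t = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if t < e[i][j]:
                    e[i][j] = t
                    root[i][j] = r
    return e, root

On the Figure 15.9 probabilities from the book this returns the expected search cost 2.75 in e[1][n].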
Knuth observed that root[i, j - 1] <= root[i, j] <= root[i + 1, j], so Exercise 15.5-4 asks for an O(n^2) algorithm obtained by modifying the original one.
After some effort I figured it out: in the innermost loop, replace the line

    for r = i to j

with

    for r = root[i, j - 1] to root[i + 1, j]

Correctness of this change is proved at this link: Optimal binary search trees.
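To be concrete, here is my Python sketch of the modified algorithm. It is identical to the cubic version above except for the bounds of the innermost loop; the fallback to r = i = j for length-1 subproblems is my own choice for handling the boundary, not something from the book:

import math

def optimal_bst_knuth(p, q, n):
    # Same conventions as optimal_bst above; only the inner loop changes.
    e = [[0.0] * (n + 1) for _ in range(n + 2)]
    w = [[0.0] * (n + 1) for _ in range(n + 2)]
    root = [[0] * (n + 1) for _ in range(n + 1)]

    for i in range(1, n + 2):
        e[i][i - 1] = q[i - 1]
        w[i][i - 1] = q[i - 1]

    for l in range(1, n + 1):
        for i in range(1, n - l + 2):
            j = i + l - 1
            e[i][j] = math.inf
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            # Knuth's bounds: root[i, j - 1] <= root[i, j] <= root[i + 1, j].
            # For l == 1 those neighbouring subproblems are empty, so fall
            # back to the only candidate r = i = j.
            lo = root[i][j - 1] if j - 1 >= i else i
            hi = root[i + 1][j] if i + 1 <= j else j
            for r in range(lo, hi + 1):
                t = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if t < e[i][j]:
                    e[i][j] = t
                    root[i][j] = r
    return e, root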
However, I'm not sure this is really O(n^2): the distance from root[i, j - 1] to root[i + 1, j] in the innermost loop is not constant, so I suspect the algorithm is still O(n^3).
So my question is: can you explain why this modification brings the running time down to O(n^2)?
PS: Perhaps I should have read Knuth's paper first, but I searched the web and could not find free access to it.
You're correct that the distance from root[i, j - 1] to root[i + 1, j] is not constant in the worst case, but it is constant on average over all the subproblems, and that is enough for a quadratic running time. Fix a subproblem length l >= 2 (for l = 1 there is exactly one candidate root per subproblem, so that case contributes only n iterations in total). The total number of innermost-loop iterations over all subproblems of length l is

S = sum_{i = 1}^{n - l + 1} (root[i + 1, j] + 1 - root[i, j - 1]),   where j = i + l - 1
  = sum_{i = 1}^{n - l + 1} (root[i + 1, i + l - 1] + 1 - root[i, i + l - 2])
  = root[n - l + 2, n] + (n - l + 1) - root[1, l - 1]

because the sum telescopes: the positive term root[i + 1, i + l - 1] of one summand is cancelled by the negative term of the next. Every root value lies between 1 and n, so S <= n + (n - l + 1) - 1 < 2n. In other words, the innermost loop does only O(n) total work for each value of l, i.e. a constant amount per subproblem on average. Summing over the n possible values of l gives O(n^2) overall.
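If you want to see the telescoping in action, here is a small check written against the optimal_bst_knuth sketch from the question (the helper name is mine); it compares the explicit sum with the closed form for every l >= 2 and verifies the O(n) bound per length:

def check_telescoping(root, n):
    # For every subproblem length l >= 2, compare the explicit sum of the
    # candidate-range sizes with the telescoped closed form, and check the
    # O(n) bound that makes the whole algorithm O(n^2).
    for l in range(2, n + 1):
        explicit = sum(root[i + 1][i + l - 1] + 1 - root[i][i + l - 2]
                       for i in range(1, n - l + 2))
        telescoped = root[n - l + 2][n] + (n - l + 1) - root[1][l - 1]
        assert explicit == telescoped
        assert explicit < 2 * n          # root values lie between 1 and n

# Usage (e.g. with the Figure 15.9 probabilities):
#     _, root = optimal_bst_knuth(p, q, n)
#     check_telescoping(root, n)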