
Find whether given sum exists over a path in a BST

The question is to find whether a given sum exists over any path in a BST. The question is damn easy if a path means root to leaf, or easy if the path means a portion of a root-to-leaf path that may not include the root or the leaf. But it becomes difficult here, because a path may span both the left and right children of a node. For example, in the given figure, a sum of 132 exists over the circled path. How can I find the existence of such a path? Using a hash to store all possible sums under a node is frowned upon!

[Figure: the example BST, with the path summing to 132 circled]

SexyBeast asked Oct 27 '12 22:10




1 Answer

You can certainly generate all possible paths, summing incrementally as you go. The fact that the tree is a BST might let you save time by bounding out certain sums, though I'm not sure that will give an asymptotic speed increase. The problem is that a sum formed using the left child of a given node will not necessarily be less than a sum formed using the right child, since the path for the former sum could contain many more nodes. The following algorithm will work for all trees, not just BSTs.

To generate all possible paths, notice that the topmost point of a path is special: it's the only point in a path which is allowed (though not required) to have both children contained in the path. Every path contains a unique topmost point. Therefore the outer layer of recursion should be to visit every tree node, and to generate all paths that have that node as the topmost point.

// Report whether any path whose topmost node is t, or any node under t,
// sums to target.  Recurses to examine every node under t.
EnumerateTopmost(Tree t, int target) {
    if (t == NULL) {
        // An empty subtree contains no paths.
        return false
    }

    // Get a list of sums for paths containing the left child.
    // Include a 0 at the start to account for a "zero-length path" that
    // does not contain any children.  This will be in increasing order
    // (provided node values are non-negative, so that 0 is the smallest).
    a = append(0, EnumerateSums(t.left))
    // Do the same for paths containing the right child.  This needs to
    // be sorted in decreasing order.
    b = reverse(append(0, EnumerateSums(t.right)))

    // "List match" to detect any pair of sums that works.
    // This is a linear-time algorithm that takes two sorted lists --
    // one increasing, the other decreasing -- and detects whether there is
    // any pair of elements (one from the first list, the other from the
    // second) that sum to a given value.  Starting at the beginning of
    // each list, we compute the current sum, and proceed to strike out any
    // elements that we know cannot be part of a satisfying pair.
    // If the sum of a[i] and b[j] is too small, then we know that a[i]
    // cannot be part of any satisfying pair, since all remaining elements
    // from b that it could be added to are at least as small as b[j], so we
    // can strike it out (which we do by advancing i by 1).  Similarly if
    // the sum of a[i] and b[j] is too big, then we know that b[j] cannot
    // be part of any satisfying pair, since all remaining elements from a
    // that b[j] could be added to are at least as big as a[i], so we can
    // strike it out (which we do by advancing j by 1).  If we get to the
    // end of either list without finding the right sum, there can be
    // no satisfying pair.
    i = 0
    j = 0
    while (i < length(a) and j < length(b)) {
        if (a[i] + b[j] + t.value < target) {
            i = i + 1
        } else if (a[i] + b[j] + t.value > target) {
            j = j + 1
        } else {
            print "Found!  Topmost node=", t
            return true
        }
    }

    // Recurse to examine the rest of the tree, passing the target along.
    return EnumerateTopmost(t.left, target) or EnumerateTopmost(t.right, target)
}

// Return a list of the sums of all downward paths that start at t
// (each such path contains t and at most one child of every node on it),
// in increasing order.
EnumerateSums(Tree t) {
    if (t == NULL) {
        // We have been called with the "child" of a leaf node.
        return []     // Empty list
    } else {
        // Include a 0 in one of the child sum lists to stand for
        // "just node t" (arbitrarily picking left here).
        // Note that even if t is a leaf node, we still call ourselves on
        // its "children" here -- in C/C++, a special "NULL" value represents
        // these nonexistent children.
        a = append(0, EnumerateSums(t.left))
        b = EnumerateSums(t.right)
        Add t.value to each element in a
        Add t.value to each element in b
        // "Ordinary" list merge that simply combines two sorted lists
        // to produce a new sorted list, in linear time.
        c = ListMerge(a, b)
        return c
    }
}
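
For anyone who wants to run this, here is a rough Python sketch of the pseudocode above. It is only one possible translation: the Node class and the names merge_sorted, enumerate_sums and has_path_sum are made up for illustration, and it assumes node values are non-negative integers so that the prepended 0 keeps each list sorted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical tree node type, not part of the original answer.
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def merge_sorted(a: List[int], b: List[int]) -> List[int]:
    """Merge two increasing lists into one increasing list in linear time."""
    out: List[int] = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def enumerate_sums(t: Optional[Node]) -> List[int]:
    """Sums of all downward paths starting at t, in increasing order."""
    if t is None:
        return []
    left = [0] + enumerate_sums(t.left)  # the 0 stands for "just node t"
    right = enumerate_sums(t.right)
    return merge_sorted([s + t.value for s in left],
                        [s + t.value for s in right])


def has_path_sum(t: Optional[Node], target: int) -> bool:
    """True if some path in the tree rooted at t sums to target."""
    if t is None:
        return False
    a = [0] + enumerate_sums(t.left)             # increasing
    b = ([0] + enumerate_sums(t.right))[::-1]    # decreasing
    i = j = 0
    while i < len(a) and j < len(b):
        total = a[i] + b[j] + t.value
        if total < target:
            i += 1        # a[i] is too small for every remaining b
        elif total > target:
            j += 1        # b[j] is too big for every remaining a
        else:
            return True   # t is the topmost node of a matching path
    return has_path_sum(t.left, target) or has_path_sum(t.right, target)
```

Calling has_path_sum(root, target) on the root of the tree answers the question for the whole tree; it does not report the path itself.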

The above pseudocode only reports the topmost node in the path. The entire path can be reconstructed by having EnumerateSums() return a list of pairs (sum, goesLeft) instead of a plain list of sums, where goesLeft is a boolean that indicates whether the path used to generate that sum initially goes left from the parent node.

The above pseudocode calculates sum lists multiple times for each node: EnumerateSums(t) will be called once for each node above t in the tree, in addition to being called for t itself. It would be possible to make EnumerateSums() memoise the list of sums for each node so that it's not recomputed on subsequent calls, but actually this doesn't improve the asymptotics: only O(n) work is required to produce a list of n sums using the plain recursion, and changing this to O(1) doesn't change the overall time complexity because the entire list of sums produced by any call to EnumerateSums() must in general be read by the caller anyway, and this requires O(n) time.

EDIT: As pointed out by Evgeny Kluev, EnumerateSums() actually behaves like a merge sort, being O(n log n) when the tree is perfectly balanced and O(n^2) when it is a single path. So memoisation will in fact give an asymptotic performance improvement.
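
One way to get the effect of memoisation without an explicit cache is a single post-order pass that returns each node's sum list to its parent, so every list is built exactly once. The sketch below is an illustration rather than part of the original pseudocode: Node, search and has_path_sum_memo are hypothetical names, and node values are again assumed to be non-negative integers.

```python
import heapq
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    # Hypothetical tree node type, not part of the original answer.
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def search(t: Optional[Node], target: int) -> Tuple[bool, List[int]]:
    """Return (found, sums) for the subtree rooted at t.

    sums is the increasing list of sums of downward paths starting at t.
    Each node's list is computed once, from its children's lists.
    """
    if t is None:
        return False, []
    found_left, left = search(t.left, target)
    found_right, right = search(t.right, target)
    found = found_left or found_right

    a = [0] + left    # increasing; 0 means "take nothing on the left"
    b = [0] + right   # increasing; scanned from the end, i.e. decreasing
    i, j = 0, len(b) - 1
    while not found and i < len(a) and j >= 0:
        total = a[i] + b[j] + t.value
        if total < target:
            i += 1
        elif total > target:
            j -= 1
        else:
            found = True

    # Downward sums from t: t alone or t plus a left path (from a),
    # or t plus a right path (from right), merged in sorted order.
    sums = list(heapq.merge((s + t.value for s in a),
                            (s + t.value for s in right)))
    return found, sums


def has_path_sum_memo(t: Optional[Node], target: int) -> bool:
    """True if some path in the tree rooted at t sums to target."""
    return search(t, target)[0]
```

The lists still have to be merged at every node, so the total work is still merge-sort-like in the number of nodes; what disappears is the repeated recomputation from every ancestor.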

It is possible to get rid of the temporary lists of sums by rearranging EnumerateSums() into an iterator-like object that performs the list merge lazily, and can be queried to retrieve the next sum in increasing order. This would entail also creating an EnumerateSumsDown() that does the same thing but retrieves sums in decreasing order, and using this in place of reverse(append(0, EnumerateSums(t.right))). Doing this brings the space complexity of the algorithm down to O(n), where n is the number of nodes in the tree, since each iterator object requires constant space (pointers to left and right child iterator objects, plus a place to record the last sum) and there can be at most one per tree node.
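
In Python, generators are a natural fit for the lazy version. The following is a minimal sketch under the same assumption of non-negative integer node values; sums_up, sums_down and has_path_sum_lazy are hypothetical names, and heapq.merge from the standard library does the lazy merging. No claim is made here about the exact memory used by Python's generator machinery; the point is only that no full list of sums is ever materialised.

```python
import heapq
from dataclasses import dataclass
from itertools import chain
from typing import Iterator, Optional


@dataclass
class Node:
    # Hypothetical tree node type, not part of the original answer.
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def sums_up(t: Optional[Node], offset: int = 0) -> Iterator[int]:
    """Lazily yield offset + (downward path sums from t), increasing."""
    if t is None:
        return
    base = offset + t.value
    yield base  # "just node t" is smallest when values are non-negative
    yield from heapq.merge(sums_up(t.left, base), sums_up(t.right, base))


def sums_down(t: Optional[Node], offset: int = 0) -> Iterator[int]:
    """Lazily yield offset + (downward path sums from t), decreasing."""
    if t is None:
        return
    base = offset + t.value
    yield from heapq.merge(sums_down(t.left, base),
                           sums_down(t.right, base), reverse=True)
    yield base  # "just node t" comes last in decreasing order


def has_path_sum_lazy(t: Optional[Node], target: int) -> bool:
    """True if some path in the tree rooted at t sums to target."""
    if t is None:
        return False
    up = chain([0], sums_up(t.left))        # increasing, 0 = no left part
    down = chain(sums_down(t.right), [0])   # decreasing, 0 = no right part
    a = next(up, None)
    b = next(down, None)
    while a is not None and b is not None:
        total = a + b + t.value
        if total < target:
            a = next(up, None)
        elif total > target:
            b = next(down, None)
        else:
            return True
    return (has_path_sum_lazy(t.left, target)
            or has_path_sum_lazy(t.right, target))
```

Note that this sketch restarts the iterators at every topmost node, so it trades the memoisation discussed earlier for the lower memory use.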

j_random_hacker answered Oct 27 '22 13:10