Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find swapped nodes in a BST

I am trying to write a program that can detect and print two nodes in BST that have been swapped.

In a three level tree, I reached near to the solution using this approach.

If (!AllSubTreeAreValid())
{
//Nodes swapped on same side of main root node
}
else
{
  int max = getMax(root->left);
  int min = getMin(root->right);
  if(max > root->data || min < root->data )
  {
     //Nodes swapped on different sides of main root node
     //Print max and min values

  }
else 
{
 //No node swappped
}
}

//Helper functions
int GetMaxEle(TreeNode* tree)
{
    while(tree->right!=NULL)
    {
        tree=tree->right;
    }
    return tree->info;
}

int GetMinEle(TreeNode* tree)
{
    while(tree->left!=NULL)
    {
        tree=tree->left;
    }
    return tree->info;
}

but the above approach failed when I tried to test with four level tree.

             20

      15            30

   10    17       25     33

9  16  12  18  22  26  31  34

Being in root node 15's right subtree, 12 is still greater (violation).

Being in root node 15's left subtree, 16 is still greater (violation).

So, 16, 12 are the swapped elements in the above BST. How do I find them through the program?

like image 513
Cipher Avatar asked Aug 31 '11 16:08

Cipher


1 Answers

One way to think about this problem is to use the fact that an inorder walk of the tree will produce all of the elements in sorted order. If you can detect deviations from the sorted order during this walk, you can try to locate the two elements that are in the wrong place.

Let's see how to do this for a simple sorted array first, then will use our algorithm to build something that works on trees. Intuitively, if we start off with a sorted array and then swap two (non-equal!) elements, we will end up with some number of elements in the array being out of place. For example, given the array

1 2 3 4 5

If we swap 2 and 4, we end up with this array:

1 4 3 2 5

How would we detect that 2 and 4 were swapped here? Well, since 4 is the greater of the two elements and was swapped downward, it should be greater than both of the elements around it. Similarly, because 2 was swapped up, it should be smaller than both of the elements around it. From this, we could conclude that 2 and 4 were swapped.

However, this doesn't always work correctly. For example, suppose that we swap 1 and 4:

4 2 3 1 5

Here, both 2 and 1 are smaller than their neighboring elements, and both 4 and 3 are larger than theirs. From this we can tell that two of these four somehow were swapped, but it's not clear which ones we should interchange. However, if we take the largest and the smallest of these values (1 and 4, respectively), we end up getting the pair that was swapped.

More generally, to find the elements that were swapped in the sequence, you want to find

  • The greatest local maximum in the array.
  • The smallest local minimum in the array.

These two elements are out of place and should be swapped.

Now, let's think about how to apply this to trees. Since an inorder walk of the tree will produce the sorted sequence with the two elements out of order, one option would be to walk the tree, recording the inorder sequence of elements we've found, then using the above algorithm. For example, consider your original BST:

              20
         /         \
      15             30
     /   \         /   \ 
   10    17      25     33
  / |   /  \    /  \    |  \
9  16  12  18  22  26  31  34

If we linearize this into an array, we get

9 10 16 15 12 17 18 20 22 25 26 30 31 33 34

Notice that 16 is greater than its surrounding elements and that 12 is less than its. This immediately tells us that 12 and 16 were swapped.

A simple algorithm for solving this problem, therefore, would be to do an inorder walk of the tree to linearize it into a sequence like a vector or deque, then to scan that sequence to find the largest local maximum and the smallest local minimum. This would run in O(n) time, using O(n) space. A trickier but more space-efficient algorithm would be to only keep track of three nodes at a time - the current node, its predecessor, and its successor - which reduces the memory usage to O(1).

Hope this helps!

like image 196
templatetypedef Avatar answered Sep 28 '22 08:09

templatetypedef