The most efficient way to test two binary trees for equality

How would you implement, in Java, the binary tree node class and the binary tree class so that they support the most efficient (in terms of run time) equality check? The check itself also has to be implemented:

    boolean equal(Node<T> root1, Node<T> root2) {}

or

    boolean equal(Tree t1, Tree t2) {}

First, I created the Node class as follows:

    public class Node<T> {
        private Node<T> left;
        private Node<T> right;
        private T data;

        // standard getters and setters
    }

and then the equals method, which takes two root nodes as arguments and runs the standard recursive comparison:

    public boolean equals(Node<T> root1, Node<T> root2) {
        boolean rootEqual = false;
        boolean lEqual = false;
        boolean rEqual = false;    

        if (root1 != null && root2 != null) {
            rootEqual = root1.getData().equals(root2.getData());

            if (root1.getLeft()!=null && root2.getLeft() != null) {
                // compare the left
                lEqual = equals(root1.getLeft(), root2.getLeft());
            }
            else if (root1.getLeft() == null && root2.getLeft() == null) {
                lEqual = true;
            }
            if (root1.getRight() != null && root2.getRight() != null) {
                // compare the right
                rEqual = equals(root1.getRight(), root2.getRight());
            }
            else if (root1.getRight() == null && root2.getRight() == null) {
                rEqual = true;
            }

            return (rootEqual && lEqual && rEqual);
        }
        return false;
    } 

My second attempt was to implement the trees using arrays, with indexes for traversal. The comparison could then be done with bitwise operations (AND) on the two arrays: read a chunk from each array and mask one with the other using AND. I failed to get my code working, so I am not posting it here (I'd appreciate your implementation of the second idea as well as your improvements).

Any thoughts on how to test binary trees for equality most efficiently?

EDIT

The question assumes structural equality (not semantic equality).

However, code that tests semantic equality (e.g. "Should we consider the two trees to be equal if their contents are identical, even if their structure is not?") would just iterate over both trees in-order, which should be straightforward.

aviad asked Mar 07 '12 07:03


3 Answers

Well for one thing you're always checking the branches, even if you spot that the roots are unequal. Your code would be simpler (IMO) and more efficient if you just returned false as soon as you spotted an inequality.

Another option to simplify things is to allow your equals method to accept null values and compare two nulls as being equal. That way you can avoid all those nullity checks in the different branches. This won't make it more efficient, but it'll be simpler:

    public boolean equals(Node<T> root1, Node<T> root2) {
        // Shortcut for reference equality; also handles equals(null, null)
        if (root1 == root2) {
            return true;
        }
        if (root1 == null || root2 == null) {
            return false;
        }
        return root1.getData().equals(root2.getData()) &&
               equals(root1.getLeft(), root2.getLeft()) &&
               equals(root1.getRight(), root2.getRight());
    }

Note that currently this will fail if root1.getData() returns null. (That may or may not be possible with the way you're adding nodes.)
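If null data is possible, one way to avoid that failure is to compare the data with `java.util.Objects.equals`, which treats two nulls as equal and never throws on a null argument. The following is only a sketch: `SimpleNode` and `NullSafeTreeEquals` are hypothetical names made up for the example, and the node is a stripped-down stand-in for the question's `Node<T>`.

```java
import java.util.Objects;

// Hypothetical minimal node for illustration; field names mirror the question's class.
final class SimpleNode<T> {
    final T data;
    final SimpleNode<T> left;
    final SimpleNode<T> right;

    SimpleNode(T data, SimpleNode<T> left, SimpleNode<T> right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }
}

final class NullSafeTreeEquals {
    // Structural equality that tolerates null nodes and null data.
    static <T> boolean equalTrees(SimpleNode<T> a, SimpleNode<T> b) {
        if (a == b) {
            return true;  // same reference, or both null
        }
        if (a == null || b == null) {
            return false; // exactly one side is missing
        }
        // Objects.equals returns true for two nulls and false if only one is null
        return Objects.equals(a.data, b.data)
                && equalTrees(a.left, b.left)
                && equalTrees(a.right, b.right);
    }
}
```

The short-circuiting `&&` chain keeps the early exit: as soon as the data differs, neither subtree is visited.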

EDIT: As discussed in comments, you could use hash codes to make a very quick "early out" - but it would add complexity.

Either you need to make your trees immutable (which is a whole other discussion), or each node needs to know about its parent, so that when a node changes (e.g. by adding a leaf or changing the value) it can update its hash code and ask its parent to update too.
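As a rough illustration of the immutable variant, here is a sketch in which each node computes a hash of its whole subtree once, at construction time. `HashedNode` and `subtreeHash` are hypothetical names invented for this example, and it assumes the data type's `hashCode()` is consistent with its `equals()`.

```java
import java.util.Objects;

// Hypothetical immutable node: nothing changes after construction,
// so the subtree hash is computed once and never needs invalidating.
final class HashedNode<T> {
    final T data;
    final HashedNode<T> left;
    final HashedNode<T> right;
    final int subtreeHash;

    HashedNode(T data, HashedNode<T> left, HashedNode<T> right) {
        this.data = data;
        this.left = left;
        this.right = right;
        int h = Objects.hashCode(data); // 0 for null data
        h = 31 * h + (left == null ? 0 : left.subtreeHash);
        h = 31 * h + (right == null ? 0 : right.subtreeHash);
        this.subtreeHash = h;
    }

    static <T> boolean equalTrees(HashedNode<T> a, HashedNode<T> b) {
        if (a == b) return true;
        if (a == null || b == null) return false;
        // Early out: different hashes prove the subtrees differ.
        if (a.subtreeHash != b.subtreeHash) return false;
        // Equal hashes prove nothing, so the full comparison is still required.
        return Objects.equals(a.data, b.data)
                && equalTrees(a.left, b.left)
                && equalTrees(a.right, b.right);
    }
}
```

The hash only speeds up the unequal case. Two trees that really are equal still cost a full traversal, because a hash collision cannot be ruled out.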

Jon Skeet answered Oct 12 '22 07:10


Out of curiosity, do you consider the two trees to be equal if their contents are identical, even if their structure is not? For example, are these equal?

  B         C        C      A
 / \       / \      / \      \
A   D     B   D    A   D      B
   /     /          \          \
  C     A            B          C
                                 \
                                  D

These trees have the same contents in the same order, but because their structures differ, your tests would not consider them equal.

If you want to test this equality, personally I'd just build an iterator for the tree using in-order traversal and iterate through the trees comparing them element by element.
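A minimal sketch of that idea follows. `ContentNode`, `InOrderIterator` and `ContentEquality` are hypothetical names made up for the example, and the iterator uses an explicit stack rather than recursion.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Objects;

// Hypothetical minimal node for illustration.
final class ContentNode<T> {
    final T data;
    final ContentNode<T> left;
    final ContentNode<T> right;

    ContentNode(T data, ContentNode<T> left, ContentNode<T> right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }
}

// Yields the data values of a tree in in-order sequence.
final class InOrderIterator<T> implements Iterator<T> {
    private final Deque<ContentNode<T>> stack = new ArrayDeque<>();

    InOrderIterator(ContentNode<T> root) {
        pushLeftSpine(root);
    }

    // Push a node and all of its left descendants; the leftmost ends up on top.
    private void pushLeftSpine(ContentNode<T> node) {
        while (node != null) {
            stack.push(node);
            node = node.left;
        }
    }

    @Override
    public boolean hasNext() {
        return !stack.isEmpty();
    }

    @Override
    public T next() {
        if (stack.isEmpty()) throw new NoSuchElementException();
        ContentNode<T> node = stack.pop();
        pushLeftSpine(node.right);
        return node.data;
    }
}

final class ContentEquality {
    // True if both trees hold equal values in the same in-order sequence,
    // regardless of how the trees are shaped.
    static <T> boolean sameContents(ContentNode<T> a, ContentNode<T> b) {
        Iterator<T> ia = new InOrderIterator<>(a);
        Iterator<T> ib = new InOrderIterator<>(b);
        while (ia.hasNext() && ib.hasNext()) {
            if (!Objects.equals(ia.next(), ib.next())) return false;
        }
        return !ia.hasNext() && !ib.hasNext(); // both must run out together
    }
}
```

Under this definition the four trees drawn above all compare as equal, since each one yields A, B, C, D.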

Hounshell answered Oct 12 '22 06:10


First of all I'm making a few general assumptions. These are assumptions that are valid for most tree-based collection classes but it's always worth checking:

  1. You consider two trees to be equal if and only if they are equal both in terms of tree structure and in terms of data values at each node (as defined by data.equals(...))
  2. null data values are allowed at tree nodes (this could be either because you allow null explicitly or because your data structure only stores non-null values at leaf nodes)
  3. There aren't any particular unusual facts you know about the distribution of data values that you can take advantage of (for example, if you knew that the only possible data values were null or the String "foo", then you wouldn't need to compare two non-null String values)
  4. The trees will typically be of moderate size and reasonably well balanced. In particular, this ensures that the trees will never be so deep that you run the risk of a StackOverflowError caused by deep recursion.

Assuming these assumptions are correct, then the approach I would suggest is:

  • Do the root reference equality check first. This quickly eliminates the case of either two nulls or the same tree being passed in for comparison with itself. Both are very common cases, and the reference equality check is extremely cheap.
  • Check the nulls next. Non-null is obviously not equal to null, which enables you to bail out early plus it establishes a non-null guarantee for later code! A very smart compiler could also theoretically use this guarantee to optimise away null pointer checks later (not sure if the JVM currently does this)
  • Check data reference equality and nulls next. This avoids descending all the way down the tree branches which you would do even in the case of unequal data if you went down the tree branches first.
  • Check data.equals() next. Again you want to check data equality before tree branches. You do this after checking for nulls since data.equals() is potentially more expensive and you want to guarantee you won't get a NullPointerException
  • Check the equality of branches recursively as the last step. It doesn't matter if you do left or right first unless there is a greater likelihood of one side being unequal, in which case you should check that side first. This might be the case if e.g. most changes were being appended to the right branch of the tree....
  • Make the comparison a static method. This is because you want to use it recursively in a way that will accept nulls as either of the two parameters (hence it isn't suitable for an instance method as this cannot be null). In addition, the JVM is very good at optimising static methods.

My implementation would therefore be something like:

    // Assumed to be declared inside Node<T>, so the private fields are directly accessible
    public static <T> boolean treeEquals(Node<T> a, Node<T> b) {
        // check for reference equality and nulls
        if (a == b) return true; // note this picks up case of two nulls
        if (a == null) return false;
        if (b == null) return false;

        // check for data inequality
        if (a.data != b.data) {
            if ((a.data == null) || (b.data == null)) return false;
            if (!(a.data.equals(b.data))) return false;
        }

        // recursively check branches
        if (!treeEquals(a.left, b.left)) return false;
        if (!treeEquals(a.right, b.right)) return false;

        // we've eliminated all possibilities for non-equality, so trees must be equal
        return true;
    }
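
If assumption 4 does not hold (for example, a tree that has degenerated into a long chain), the same checks can be run iteratively with an explicit stack instead of recursion. The following is only a sketch of that variation, not part of the approach above: `DeepNode`, `IterativeTreeEquals` and `pushPair` are hypothetical names made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Objects;

// Hypothetical minimal node for illustration.
final class DeepNode<T> {
    final T data;
    final DeepNode<T> left;
    final DeepNode<T> right;

    DeepNode(T data, DeepNode<T> left, DeepNode<T> right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }
}

final class IterativeTreeEquals {
    static <T> boolean treeEquals(DeepNode<T> a, DeepNode<T> b) {
        // Pairs of nodes still to be compared, stored as two consecutive entries.
        Deque<DeepNode<T>> stack = new ArrayDeque<>();
        if (!pushPair(stack, a, b)) return false;
        while (!stack.isEmpty()) {
            DeepNode<T> y = stack.pop(); // pushed second, so popped first
            DeepNode<T> x = stack.pop();
            if (!Objects.equals(x.data, y.data)) return false;
            if (!pushPair(stack, x.left, y.left)) return false;
            if (!pushPair(stack, x.right, y.right)) return false;
        }
        return true;
    }

    // Returns false when exactly one node is null (a structural mismatch).
    // Pairs that are the same reference, or both null, are skipped because
    // they are trivially equal; ArrayDeque also does not accept null elements.
    private static <T> boolean pushPair(Deque<DeepNode<T>> stack, DeepNode<T> x, DeepNode<T> y) {
        if (x == y) return true;
        if (x == null || y == null) return false;
        stack.push(x);
        stack.push(y);
        return true;
    }
}
```

The work stack lives on the heap, so the depth of the tree is limited by available memory rather than by the thread's call stack.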

mikera answered Oct 12 '22 06:10