Self-containing array deep equals

Tags: java, arrays, algorithm

I need to perform structural comparison on two Object[] arrays which may contain themselves:

Object[] o1 = new Object[] { "A", null };
o1[1] = o1;

Object[] o2 = new Object[] { "A", null };
o2[1] = o2;

Arrays.deepEquals(o1, o2); // undefined behavior

Unfortunately, deepEquals doesn't work in this case: its documentation leaves the behavior undefined for arrays that contain themselves. The example above should yield true.
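For reference, here is a minimal sketch of what "doesn't work" looks like in practice. It assumes an OpenJDK-style implementation, where deepEquals recurses into nested arrays without tracking what it has already visited; the Javadoc itself only promises undefined behavior, and the class and method names are made up for this illustration.

```java
import java.util.Arrays;

class DeepEqualsCycleDemo {
    /** Returns true if Arrays.deepEquals ran out of stack on self-containing input. */
    static boolean overflowsOnCycle() {
        Object[] o1 = { "A", null };
        o1[1] = o1;
        Object[] o2 = { "A", null };
        o2[1] = o2;
        try {
            Arrays.deepEquals(o1, o2);
            return false;
        } catch (StackOverflowError e) {
            // deepEquals kept descending into o1[1] / o2[1] until the stack ran out.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("overflowed: " + overflowsOnCycle());
    }
}
```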

Is there an algorithm which can reliably calculate this?

My idea is roughly as follows:

List<Object> xs = new ArrayList<>();
List<Object> ys = new ArrayList<>();

boolean equal(Object[] o1, Object[] o2, List<Object> xs, List<Object> ys) {
   xs.add(o1);
   ys.add(o2);
   boolean result = true;
   for (int i = 0; i < o1.length; i++) {
       if (o1[i] instanceof Object[]) {
           int idx1 = xs.lastIndexOf(o1[i]);
           if (idx1 >= 0) { idx1 = xs.size() - idx1 - 1; }
           if (o2[i] instanceof Object[]) {
               int idx2 = xs.lastIndexOf(o2[i]);
               if (idx2 >= 0) { idx2 = ys.size() - idx2 - 1; }
               if (idx1 == idx2) {
                   if (idx1 >= 0) {
                       continue;
                   }
                   if (!equal(o1[i], o2[i], xs, ys)) {
                       result = false;
                       break;
                   }
               }
           }
       }
   }
   xs.removeLast();
   ys.removeLast();
   return result;
}


asked Mar 05 '12 07:03

akarnokd

2 Answers

As I mentioned in my comments above, your code has some compile errors, and you've left out a lot of it, which makes it hard to be 100% sure of exactly how it's supposed to work once the code is completed. But after finishing the code, fixing one clear typo (you wrote idx2 = xs.lastIndexOf(o2[i]), but I'm sure you meant idx2 = ys.lastIndexOf(o2[i])) and one thing that I think is a typo (I don't think that you meant for if (!equal(o1[i], o2[i], xs, ys)) to be nested inside if (idx1 == idx2)), removing some no-op code, and restructuring a bit (to a style that I find clearer; YMMV), I get this:

boolean equal(final Object[] o1, final Object[] o2)
{
    return _equal(o1, o2, new ArrayList<Object>(), new ArrayList<Object>());
}

private static boolean _equal(final Object[] o1, final Object[] o2,
                                 final List<Object> xs, final List<Object> ys)
{
    if(o1.length != o2.length)
        return false;

    xs.add(o1);
    ys.add(o2);
    try
    {
        for(int i = 0; i < o1.length; i++)
        {
            if(o1[i] == null && o2[i] == null)
                continue;
            if(o1[i] == null || o2[i] == null)
                return false;
            if(o1[i].equals(o2[i]))
                continue;
            if(! (o1[i] instanceof Object[]) || ! (o2[i] instanceof Object[]))
                return false;

            final int idx1 = xs.lastIndexOf(o1[i]);

            if(idx1 >= 0 && idx1 == ys.lastIndexOf(o2[i]))
                continue;

            if(! _equal((Object[])o1[i], (Object[])o2[i], xs, ys))
                return false;
        }

        return true;
    }
    finally
    {
        xs.remove(xs.size() - 1);
        ys.remove(ys.size() - 1);
    }
}

which mostly works. The logic is: whenever it gets two Object[]s, it checks whether it's currently comparing each of them higher up in the stack and, if so, whether the topmost stack-frame that's comparing one of them is also the topmost stack-frame that's comparing the other. (That is the logic you intended, right?)

The only serious bug I can see is in this sort of situation:

// a one-element array that directly contains itself:
final Object[] a = { null }; a[0] = a;
// a one-element array that contains itself via another one-element array:
final Object[][] b = { { null } }; b[0][0] = b;

// should return true (right?); instead, overflows the stack:
equal(a, b, new ArrayList<Object>(), new ArrayList<Object>());

You see, in the above, the last element of xs will always be a, but the last element of ys will alternate between b and b[0]. In each recursive call, xs.lastIndexOf(a) will always be the greatest index of xs, while ys.lastIndexOf(b) or ys.lastIndexOf(b[0]) (whichever one is needed) will always be one less than the greatest index of ys.

The problem is, the logic shouldn't be, "the topmost comparison of o1[i] is in the same stack-frame as the topmost comparison of o2[i]"; rather, it should be, "there exists some stack-frame — any stack-frame at all — that is comparing o1[i] to o2[i]". But for efficiency, we can actually use the logic "there is, or has ever been, a stack-frame that is/was comparing o1[i] to o2[i]"; and we can use a Set of pairs of arrays rather than two Lists of arrays. To that end, I wrote this:

private static boolean equal(final Object[] a1, final Object[] a2)
{
    return _equal(a1, a2, new HashSet<ArrayPair>());
}

private static boolean _equal
    (final Object[] a1, final Object[] a2, final Set<ArrayPair> pairs)
{
    if(a1 == a2)
        return true;
    if(a1.length != a2.length)
        return false;

    if(! pairs.add(new ArrayPair(a1, a2)))
    {
        // If we're here, then pairs already contained {a1,a2}. This means
        // either that we've previously compared a1 and a2 and found them to
        // be equal (in which case we obviously want to return true), or
        // that we're currently comparing them somewhere higher in the
        // stack and haven't *yet* found them to be unequal (in which case
        // we still want to return true: if it turns out that they're
        // unequal because of some later difference we haven't reached yet,
        // that's fine, because the comparison higher in the stack will
        // still find that).

        return true;
    }

    for(int i = 0; i < a1.length; ++i)
    {
        if(a1[i] == a2[i])
            continue;
        if(a1[i] == null || a2[i] == null)
            return false;
        if(a1[i].equals(a2[i]))
            continue;
        if(! (a1[i] instanceof Object[]) || ! (a2[i] instanceof Object[]))
            return false;
        if(! _equal((Object[]) a1[i], (Object[]) a2[i], pairs))
            return false;
    }

    return true;
}

private static final class ArrayPair
{
    private final Object[] a1;
    private final Object[] a2;

    public ArrayPair(final Object[] a1, final Object[] a2)
    {
        if(a1 == null || a2 == null)
            throw new NullPointerException();

        this.a1 = a1;
        this.a2 = a2;
    }

    @Override
    public boolean equals(final Object that)
    {
        if(! (that instanceof ArrayPair))
            return false;

        final ArrayPair other = (ArrayPair) that;

        // Unordered pair, compared by reference: {a1,a2} equals {a2,a1}.
        return (a1 == other.a1 && a2 == other.a2)
            || (a1 == other.a2 && a2 == other.a1);
    }

    @Override
    public int hashCode()
        { return a1.hashCode() + a2.hashCode(); }
}

It should be clear that the above cannot result in infinite recursion: if the program has a finite number of arrays, then it has a finite number of pairs of arrays, and only one stack-frame at a time can be comparing a given pair (once the comparison of a pair begins, the pair is added to pairs, and any later attempt to compare that pair immediately returns true), so the total stack depth is finite at any given time. Of course, if the number of arrays is huge, then the above can still overflow the stack; the recursion is bounded, but so is the maximum stack size.

I'd also recommend splitting the for-loop into two for-loops, one after the other: the first skips all the elements that are arrays, and the second skips all the elements that aren't. This can avoid expensive comparisons in many cases, because a cheap mismatch between non-array elements is found before any nested arrays are descended into.
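If stack depth is a real concern, the same pair-set idea can be driven by an explicit worklist instead of recursion. The following is only a sketch under that assumption, not part of the code above: IterativeCyclicEquals and its nested Pair class are hypothetical names, and Pair mirrors ArrayPair as an unordered, reference-based pair. It also uses the two-loop split: non-array elements are checked first, and nested array pairs are queued afterwards.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

class IterativeCyclicEquals {
    /** Unordered identity pair of arrays; a hypothetical stand-in for ArrayPair. */
    private static final class Pair {
        final Object[] a1;
        final Object[] a2;

        Pair(Object[] a1, Object[] a2) {
            this.a1 = a1;
            this.a2 = a2;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Pair)) return false;
            Pair p = (Pair) o;
            return (a1 == p.a1 && a2 == p.a2) || (a1 == p.a2 && a2 == p.a1);
        }

        @Override
        public int hashCode() {
            return System.identityHashCode(a1) + System.identityHashCode(a2);
        }
    }

    /** Structural equality for non-null, possibly self-containing arrays. */
    static boolean equal(Object[] x, Object[] y) {
        Set<Pair> seen = new HashSet<>();
        Deque<Pair> work = new ArrayDeque<>();
        work.push(new Pair(x, y));
        while (!work.isEmpty()) {
            Pair p = work.pop();
            // Same array, or a pair that was already scheduled: nothing new to learn.
            if (p.a1 == p.a2 || !seen.add(p)) continue;
            if (p.a1.length != p.a2.length) return false;
            // Pass 1: cheap checks on elements that are not both Object[].
            for (int i = 0; i < p.a1.length; i++) {
                Object e1 = p.a1[i];
                Object e2 = p.a2[i];
                if (e1 == e2) continue;
                if (e1 == null || e2 == null) return false;
                if (e1 instanceof Object[] && e2 instanceof Object[]) continue;
                if (!e1.equals(e2)) return false;
            }
            // Pass 2: defer nested array pairs to the worklist instead of recursing.
            for (int i = 0; i < p.a1.length; i++) {
                Object e1 = p.a1[i];
                Object e2 = p.a2[i];
                if (e1 != e2 && e1 instanceof Object[] && e2 instanceof Object[]) {
                    work.push(new Pair((Object[]) e1, (Object[]) e2));
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Object[] a = { null };
        a[0] = a;
        Object[][] b = { { null } };
        b[0][0] = b;
        System.out.println(equal(a, b)); // true

        Object[] o1 = { "A", null };
        o1[1] = o1;
        Object[] o3 = { "B", null };
        o3[1] = o3;
        System.out.println(equal(o1, o3)); // false
    }
}
```

The trade-off is that the worklist and the seen set live on the heap, so a chain of many thousands of nested arrays costs memory rather than stack frames.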

It should also be clear that the above will never return false when it should return true, since it only returns false when it finds an actual difference.

Lastly, I think it should be clear that the above will never return true when it should return false, since for every pair of objects, one full loop is always made over all the elements. This part is trickier to prove, but in essence, we've defined structural equality in such a way that two arrays are only structurally unequal if we can find some difference between them; and the above code does eventually examine every element of every array it encounters, so if there were a findable difference, it would find it.

Notes:

I didn't worry about arrays of primitives, int[] and double[] and so on. Adam's answer raises the possibility that you would want these to be compared elementwise as well; if that's needed, it's easily added (since it wouldn't require recursion: arrays of primitives can't contain arrays), but the above code just uses Object.equals(Object) for them, which means reference-equality.
The above code assumes that Object.equals(Object) implements a symmetric relation, as its contract specifies. In reality, however, that contract is not always fulfilled; for example, new java.util.Date(0L).equals(new java.sql.Timestamp(0L)) is true, while new java.sql.Timestamp(0L).equals(new java.util.Date(0L)) is false. If order matters for your purposes (if you want equal(new Object[]{new java.util.Date(0L)}, new Object[]{new java.sql.Timestamp(0L)}) to be true and equal(new Object[]{new java.sql.Timestamp(0L)}, new Object[]{new java.util.Date(0L)}) to be false), then you'll want to change ArrayPair.equals(Object), and probably ArrayPair.hashCode() as well, to care about which array is which.
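To make both notes concrete, here is a rough sketch of a leaf comparison that could stand in for the plain a1[i].equals(a2[i]) call. The helper name leafEquals and the class LeafEquality are invented for this example. It compares arrays of primitives elementwise via the Arrays.equals overloads, and otherwise calls left.equals(right) in exactly that order, so the Date/Timestamp asymmetry shows up directly.

```java
import java.sql.Timestamp;
import java.util.Arrays;
import java.util.Date;

class LeafEquality {
    /**
     * Compares two non-null elements, neither of which is an Object[].
     * Primitive arrays are compared elementwise; anything else uses
     * left.equals(right), deliberately in that order.
     */
    static boolean leafEquals(Object left, Object right) {
        if (left instanceof int[] && right instanceof int[])
            return Arrays.equals((int[]) left, (int[]) right);
        if (left instanceof long[] && right instanceof long[])
            return Arrays.equals((long[]) left, (long[]) right);
        if (left instanceof double[] && right instanceof double[])
            return Arrays.equals((double[]) left, (double[]) right);
        if (left instanceof float[] && right instanceof float[])
            return Arrays.equals((float[]) left, (float[]) right);
        if (left instanceof boolean[] && right instanceof boolean[])
            return Arrays.equals((boolean[]) left, (boolean[]) right);
        if (left instanceof byte[] && right instanceof byte[])
            return Arrays.equals((byte[]) left, (byte[]) right);
        if (left instanceof char[] && right instanceof char[])
            return Arrays.equals((char[]) left, (char[]) right);
        if (left instanceof short[] && right instanceof short[])
            return Arrays.equals((short[]) left, (short[]) right);
        return left.equals(right);
    }

    public static void main(String[] args) {
        Object date = new Date(0L);
        Object stamp = new Timestamp(0L);
        System.out.println(leafEquals(date, stamp)); // true
        System.out.println(leafEquals(stamp, date)); // false
        System.out.println(leafEquals(new int[] { 1, 2 }, new int[] { 1, 2 })); // true
    }
}
```

If the comparison is order-sensitive like this, the pair class has to be order-sensitive too; otherwise a cached {a1,a2} result could be reused for {a2,a1}.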


answered Sep 21 '22 10:09

ruakh

You could add all visited objects to a temporary Map<Object, Object> to make sure that you do not visit or inspect them again. The value is always a new Object, which is used to replace already-visited instances in your result lists.

Every time you see an object:

1. Check whether the map contains the instance.
2. If not, put it into the map; the map value is a new Object.
3. If so, use the map value (the unique, new Object) in your list (xs or ys).

In your example, the result lists should look like this (pseudo language):

xs == {o1, "A", obj2}     // obj2 == map.get(o2);
ys == {o2, "A", obj1}     // obj1 == map.get(o1);

This prevents infinite loops.
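One possible reading of this idea, sketched below with some liberties: the map is an IdentityHashMap, and instead of a fresh Object the placeholder is the visit order of the array, so that two flattened lists can be compared with List.equals. That substitution is my assumption, as are the names VisitedMapFlatten, Token and flatten.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class VisitedMapFlatten {
    /** Placeholder: 'O' opens an array of length n, 'R' refers back to the n-th visited array. */
    static final class Token {
        final char kind;
        final int n;

        Token(char kind, int n) {
            this.kind = kind;
            this.n = n;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Token && ((Token) o).kind == kind && ((Token) o).n == n;
        }

        @Override
        public int hashCode() {
            return 31 * kind + n;
        }

        @Override
        public String toString() {
            return "" + kind + n;
        }
    }

    /** Flattens a possibly self-containing array into a finite list. */
    static List<Object> flatten(Object[] root) {
        Map<Object[], Integer> visited = new IdentityHashMap<>();
        List<Object> out = new ArrayList<>();
        flatten(root, visited, out);
        return out;
    }

    private static void flatten(Object[] arr, Map<Object[], Integer> visited, List<Object> out) {
        visited.put(arr, visited.size()); // remember the array by reference, keyed to its visit order
        out.add(new Token('O', arr.length));
        for (Object e : arr) {
            if (!(e instanceof Object[])) {
                out.add(e);
                continue;
            }
            Integer id = visited.get(e);
            if (id != null) out.add(new Token('R', id)); // already visited: emit the placeholder
            else flatten((Object[]) e, visited, out);
        }
    }

    public static void main(String[] args) {
        Object[] o1 = { "A", null };
        o1[1] = o1;
        Object[] o2 = { "A", null };
        o2[1] = o2;
        System.out.println(flatten(o1));                     // [O2, A, R0]
        System.out.println(flatten(o1).equals(flatten(o2))); // true
    }
}
```

Note that this compares the shape of the reference graph, so it is stricter than the pair-set approach in the other answer: a one-element array that contains itself directly and one that contains itself via another array flatten to different lists.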

answered Sep 23 '22 10:09

Andreas Dolk
