 

Compare arrays as (multi-) sets

I'm looking for an efficient way to find out whether two arrays contain the same number of equal elements (in the == sense), in any order:

foo = {/*some object*/}
bar = {/*some other object*/}

a = [1,2,foo,2,bar,2]
b = [bar,2,2,2,foo,1]

sameElements(a, b) --> true

PS. Note that pretty much every solution in the thread uses === and not == for comparison. This is fine for my needs though.

georg asked Jun 04 '13 09:06



1 Answer

Update 5 I posted a new answer with a different approach.

Update

I extended the code so that it can check either by reference or by equality.

Just pass true as the second parameter to do a reference check.

I also added the example to Bruno's JSPerf.

  • It runs at about 11 ops/s doing a reference check

I meant to comment the code once I had some spare time to explain it a bit more. Done.

Update 2.

As Bruno pointed out in the comments, sameElements([NaN],[NaN]) yields false.

In my opinion this is the correct behaviour, as NaN is ambiguous and should always lead to a false result, at least when comparing NaN.equals(NaN). But he had quite a good point. The question is whether

[1,2,foo,bar,NaN,3] should be equal to [1,3,foo,NaN,bar,2] or not.

Honestly, I'm a bit torn on whether it should, so I added two flags.

  • Number.prototype.equals.NaN
    • If true
      • NaN.equals(NaN) //true
  • Array.prototype.equals.NaN
    • If true
      • [NaN].equals([NaN],true) //true
      • Note: this only applies to reference checks, as a deep check invokes Number.prototype.equals anyway
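To make the NaN discussion concrete, here is a minimal sketch of the self-inequality test that both flags rely on. The helper names isReallyNaN and sameValue are made up for illustration and are not part of the answer's code; the third parameter plays the role of the NaN flags.

```javascript
// NaN is the only JavaScript value that is not equal to itself,
// so `x !== x` detects it without the type coercion done by the global isNaN().
function isReallyNaN(x) {            // hypothetical helper name
    return x !== x;
}

// Hypothetical comparison: strict equality, optionally treating NaN as equal to NaN.
function sameValue(a, b, nanEqual) {
    if (nanEqual && isReallyNaN(a) && isReallyNaN(b)) return true;
    return a === b;
}

console.log(NaN === NaN);                       // false
console.log(sameValue(NaN, NaN, false));        // false
console.log(sameValue(NaN, NaN, true));         // true
console.log(isNaN("abc"), isReallyNaN("abc"));  // true false
```

The last line shows why the self-comparison is the safer test: the global isNaN converts its argument to a number first, so it also reports true for strings that are not numeric.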

Update 3:

Dang, I totally missed 2 lines in the sort function.

Added

 r[0] = a._srt; //DANG i totally missed this line
 r[1] = b._srt; //And this.

Line 105 in the Fiddle

This is kind of important, as it determines the consistent order of the elements.

Update 4
I tried to optimize the sort function a bit and managed to get it up to about 20 ops/s.

Below is the updated code, as well as the updated fiddle =)

I also chose to mark the objects outside the sort function. It doesn't seem to make a performance difference anymore, and it's more readable.


Here is an approach using Object.defineProperty to add equals functions to the prototypes of Array, Object, Number, String and Boolean. For performance reasons this avoids type checking inside one function, as we can recursively call .equals on any element.

But of course, checking objects for equality may cause performance issues with big objects.

So if anyone feels uneasy about manipulating native prototypes, just do a type check and put everything into one function.

Object.defineProperty(Boolean.prototype, "equals", {
        enumerable: false,
        configurable: true,
        value: function (c) {
            return this == c; //For booleans simply return the equality
        }
    });

Object.defineProperty(Number.prototype, "equals", {
        enumerable: false,
        configurable: true,
        value: function (c) {
            if (Number.prototype.equals.NaN == true && isNaN(this) && c != c) return true; //let NaN equals NaN if flag set
            return this == c; // else do a normal compare
        }
    });

Number.prototype.equals.NaN = false; //Set to true to return true for NaN == NaN

Object.defineProperty(String.prototype, "equals", {
        enumerable: false,
        configurable: true,
        value: Boolean.prototype.equals //the same (now we covered the primitives)
    });

Object.defineProperty(Object.prototype, "equals", {
        enumerable: false,
        configurable: true,
        value: function (c, reference) {
            if (true === reference) //If its a check by reference
                return this === c; //return the result of comparing the reference
            if (typeof this != typeof c) { 
                return false; //if the types don't match (Object equals primitive) immediately return
            }
            var d = [Object.keys(this), Object.keys(c)],//create an array with the keys of the objects, which get compared
                f = d[0].length; //store length of keys of the first obj (we need it later)
            if (f !== d[1].length) {//If the Objects differ in the length of their keys
                return false; //immediately return
            }
            for (var e = 0; e < f; e++) { //iterate over the keys of the first object
                if (d[0][e] != d[1][e] || !this[d[0][e]].equals(c[d[1][e]])) {
                    return false; //if either the key name or the value does not match, return false. Calling .equals on 2 primitives simply compares them, as e.g. Number.prototype.equals gets called
                }
            }
            return true; //everything is equal, return true
        }
    });
Object.defineProperty(Array.prototype, "equals", {
        enumerable: false,
        configurable: true,
        value: function (c,reference) {

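            var d = this.length;
            if (d != c.length) { //arrays of different length can never contain the same elements
                return false;
            }
            var f = Array.prototype.equals.sort(this.concat()); //sort a copy of this array
            c = Array.prototype.equals.sort(c.concat(),f); //sort a copy of c, ordering its objects like in f

            if (reference){ //compare element by element without descending into objects
                for (var e = 0; e < d; e++) {
                    if (f[e] != c[e] && !(Array.prototype.equals.NaN && f[e] != f[e] && c[e] != c[e])) {
                        return false; //the elements differ and are not both NaN (with the NaN flag set)
                    }
                }                
            } else { //deep check: delegate to the .equals of each element
                for (var e = 0; e < d; e++) {
                    if (!f[e].equals(c[e])) {
                        return false;
                    }
                }
            }
            return true;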

        }
    });

Array.prototype.equals.NaN = false; //Set to true to allow [NaN].equals([NaN]) //true

Object.defineProperty(Array.prototype.equals,"sort",{
  enumerable:false,
  value:function sort (curr,prev) {
         var weight = {
            "[object Undefined]":6,         
            "[object Object]":5,
            "[object Null]":4,
            "[object String]":3,
            "[object Number]":2,
            "[object Boolean]":1
        }
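        if (prev) { //mark the objects with their position in the already sorted array
            for (var i = prev.length - 1,j,t;i>=0;i--) { //start at the last valid index and include index 0
                t = typeof (j = prev[i]);
                if (j != null && t === "object") {
                     j._pos = i;   
                } else if (t !== "object" && t != "undefined" ) break; //objects, null and undefined sort to the end, so stop at the first other value
            }
        }

        curr.sort (sorter);

        if (prev) { //remove the marks again
            for (var k = prev.length - 1,l,t;k>=0;k--) {
                t = typeof (l = prev[k]);
                if (t === "object" && l != null) {
                    delete l._pos;
                } else if (t !== "object" && t != "undefined" ) break;
            }
        }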
        return curr;

        function sorter (a,b) {

             var tStr = Object.prototype.toString;
             var types = [tStr.call(a),tStr.call(b)];
             if (types[0] === types[1] && types[0] === "[object Object]") {
                 if (prev) return a._pos - b._pos; //order the objects like in the already sorted array
                 else {
                     return a === b ? 0 : 1;
                 }
             } else if (types [0] !== types [1]){
                     return weight[types[0]] - weight[types[1]]; //different types: order by type weight
             }

            return a>b?1:a<b?-1:0; //same non-Object type: natural order
        }

    }

});

With this we can reduce the sameElements function to:

function sameElements(c, d,referenceCheck) {
     return c.equals(d,referenceCheck);  //call .equals of Array.prototype.
}

Note: of course you could put all the equals functions into the sameElements function, at the cost of the type checking.
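For anyone who prefers that route, the following is a rough sketch of what the single-function variant could look like. It is not the code from the fiddle: the names deepEquals and sameElementsDeep are hypothetical, it only handles plain objects, arrays and primitives, it treats NaN as unequal to NaN, and it matches elements pairwise (quadratic time) instead of sorting.

```javascript
// Hypothetical standalone helpers; nothing is added to native prototypes.
function deepEquals(a, b) {
    if (a === b) return true;                        // same reference or identical primitive
    if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) {
        return false;                                // differing primitives, or object vs. primitive
    }
    if (Array.isArray(a) !== Array.isArray(b)) return false;
    if (Array.isArray(a)) return sameElementsDeep(a, b); // nested arrays are compared as multisets too
    var keysA = Object.keys(a), keysB = Object.keys(b);
    if (keysA.length !== keysB.length) return false;
    for (var i = 0; i < keysA.length; i++) {
        var key = keysA[i];
        if (!Object.prototype.hasOwnProperty.call(b, key) || !deepEquals(a[key], b[key])) {
            return false;
        }
    }
    return true;
}

function sameElementsDeep(a, b) {
    if (a.length !== b.length) return false;
    var rest = b.slice();                            // elements of b that are still unmatched
    for (var i = 0; i < a.length; i++) {
        var found = false;
        for (var j = 0; j < rest.length && !found; j++) {
            if (deepEquals(a[i], rest[j])) {
                rest.splice(j, 1);                   // consume the match so duplicates are counted
                found = true;
            }
        }
        if (!found) return false;
    }
    return true;
}

var foo = { a: 1, arr: [1, 2, 3] };
var bar = { a: 1, arr: [1, 2, 3] };
console.log(sameElementsDeep([1, 2, foo, 2], [2, bar, 2, 1])); // true
console.log(sameElementsDeep([1, 2, foo], [1, 2, 2]));         // false
```

The type checks at the top of deepEquals are exactly the cost mentioned above: every comparison has to find out what kind of value it is dealing with before it can compare.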

Now here are 4 examples: 3 with reference checking (one of them with NaN) and 1 with deep checking.

var foo = {
    a: 1,
    obj: {
        number: 2,
        bool: true,
        string: "asd"
    },
    arr: [1, 2, 3]
};

var bar = {
    a: 1,
    obj: {
        number: 2,
        bool: true,
        string: "asd"
    },
    arr: [1, 2, 3]
};

var foobar = {
    a: 1,
    obj: {
        number: 2,
        bool: true,
        string: "asd"
    },
    arr: [1, 2, 3, 4]
};

var a = [1, 2, foo, 2, bar, 2];
var b = [foo, 2, 2, 2, bar, 1];
var c = [bar, 2, 2, 2, bar, 1];

These are the arrays we compare, and the output is:

  1. Check a and b with references only.

    console.log (sameElements ( a,b,true)) //true As they contain the same elements

  2. Check b and c with references only

    console.log (sameElements (b,c,true)) //false as c contains bar twice.

  3. Check b and c deeply

    console.log (sameElements (b,c,false)) //true as bar and foo are equal but not the same

  4. Check for 2 Arrays containing NaN

    Array.prototype.equals.NaN = true;
    console.log(sameElements([NaN],[NaN],true)); //true.
    Array.prototype.equals.NaN = false;
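As a cross-check for the reference-check results above, here is a sketch of a different technique: counting occurrences instead of sorting. It assumes an engine with the ES2015 Map, and the name sameElementsByCount is made up for this illustration. Map looks up keys with the SameValueZero comparison, so objects are matched by reference and NaN is treated as equal to NaN, which corresponds to running the reference check with the NaN flag set.

```javascript
// Hypothetical alternative to the sort-based reference check: count occurrences in a Map.
function sameElementsByCount(a, b) {
    if (a.length !== b.length) return false;
    var counts = new Map();
    for (var i = 0; i < a.length; i++) {
        counts.set(a[i], (counts.get(a[i]) || 0) + 1);
    }
    for (var j = 0; j < b.length; j++) {
        var left = counts.get(b[j]);
        if (!left) return false;          // element is missing from a, or already used up
        counts.set(b[j], left - 1);
    }
    return true;                          // the lengths are equal, so every count is now zero
}

var foo = { a: 1 }, bar = { a: 1 };
var a = [1, 2, foo, 2, bar, 2];
var b = [foo, 2, 2, 2, bar, 1];
var c = [bar, 2, 2, 2, bar, 1];
console.log(sameElementsByCount(a, b));         // true
console.log(sameElementsByCount(b, c));         // false
console.log(sameElementsByCount([NaN], [NaN])); // true
```

Unlike the == comparison asked for in the question, this never considers 1 and "1" equal, and it has no deep-check mode.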

Demo on JSFiddle

Moritz Roessler answered Sep 30 '22 15:09
