
Remove duplicate objects from an array using javascript

I am trying to figure out an efficient way to remove duplicate objects from an array. I looked around the internet, but everything I found either uses primitive data or does not scale to large arrays. This is my current implementation, which can be improved; I would also like to avoid labels.

    Test.prototype.unique = function (arr, artist, title, cb) {
        console.log(arr.length);
        var n, y, x, i, r;
        r = [];
        o: for (i = 0, n = arr.length; i < n; i++) {
            for (x = 0, y = r.length; x < y; x++) {
                if (r[x].artist == arr[i].artist && r[x].title == arr[i].title) {
                    continue o;
                }
            }
            r.push(arr[i]);
        }

        cb(r);
    };

and the array looks something like this:

[{title: "sky", artist: "jon"}, {title: "rain", artist: "Paul"}, ....]

Order does not matter, but if sorting makes it more efficient then I am up for the challenge...

For people who do not know: o is a label, and "continue o" jumps back to the outer loop instead of pushing to the new array.

Pure JavaScript please, no libraries.

ANSWERS SO FAR:

The Performance Test for the answers below: http://jsperf.com/remove-duplicates-for-loops

Lion789 asked Oct 21 '13 17:10


2 Answers

I see. The problem is that the complexity is quadratic: for every element of the input you scan the result array again. One trick to avoid this is to use an "associative array" (in JavaScript, a plain object used as a map).

Loop over the array and use each element's value as a key in the associative array. Since keys are unique, duplicates are eliminated automatically.

Since you compare on both title and artist, you can build the key from the two fields, something like:

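var arrResult = {};
var i, n; // declared explicitly to avoid implicit globals
for (i = 0, n = arr.length; i < n; i++) {
    var item = arr[i];
    // duplicates produce the same key, so a later item overwrites an earlier one
    arrResult[ item.title + " - " + item.artist ] = item;
}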

Then you just loop over arrResult and rebuild the array.

var i = 0;
var nonDuplicatedArray = [];    
for(var item in arrResult) {
    nonDuplicatedArray[i++] = arrResult[item];
}
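
For reference, the same idea can be sketched in a single pass with a Map, assuming an environment that supports ES2015 (which postdates the question). The function name uniqueByArtistAndTitle is hypothetical, and the key is built with JSON.stringify so that a title or artist containing " - " cannot collide with another pair. Unlike the object version above, this sketch keeps the first occurrence of each duplicate, and it compares values strictly by their JSON form rather than with ==.

```javascript
// Hypothetical helper: removes duplicates by artist + title in one pass.
function uniqueByArtistAndTitle(arr) {
    var seen = new Map();
    for (var i = 0; i < arr.length; i++) {
        var item = arr[i];
        // Serializing the pair as a JSON array gives an unambiguous composite key
        var key = JSON.stringify([item.artist, item.title]);
        if (!seen.has(key)) {
            seen.set(key, item); // keep the first occurrence only
        }
    }
    // A Map iterates in insertion order, so the input order is preserved
    return Array.from(seen.values());
}

var songs = [
    { title: "sky", artist: "jon" },
    { title: "rain", artist: "Paul" },
    { title: "sky", artist: "jon" }
];
var uniqueSongs = uniqueByArtistAndTitle(songs);
console.log(uniqueSongs.length); // 2
```

Each element is visited once and each Map lookup is expected to take roughly constant time, so the whole pass should be close to linear in the array length.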

Updated to include Paul's comment. Thanks!

Henrique Feijo answered Sep 19 '22 02:09


Here is a solution that works for me.

Helper functions:

// sorts an array of objects according to one field
// call like this: sortObjArray(myArray, "name" );
// it will modify the input array
var sortObjArray = function(arr, field) {
    arr.sort(
        function compare(a,b) {
            if (a[field] < b[field])
                return -1;
            if (a[field] > b[field])
                return 1;
            return 0;
        }
    );
}

// call like this: uniqueDishes = removeDuplicatesFromObjArray(dishes, "dishName");
// it will NOT modify the input array
// input array MUST be sorted by the same field (asc or desc doesn't matter)
var removeDuplicatesFromObjArray = function(arr, field) {
    var u = [];
    arr.reduce(function (a, b) {
        if (a[field] !== b[field]) u.push(b);
        return b;
    }, []);
    return u;
}

and then simply call:

        sortObjArray(dishes, "name");
        dishes = removeDuplicatesFromObjArray(dishes, "name");
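
Since the question compares two fields (artist and title), here is a rough sketch of the same sort-then-skip-neighbours idea extended to several fields. The name uniqueBySortedFields and the sample data are made up for illustration, and it assumes the compared fields hold strings or numbers of the same type. It sorts a copy, so the input array is not modified.

```javascript
// Hypothetical helper: sort a copy by the given fields, then drop adjacent duplicates.
function uniqueBySortedFields(arr, fields) {
    // Compare two objects field by field, in the order given
    function compare(a, b) {
        for (var i = 0; i < fields.length; i++) {
            var f = fields[i];
            if (a[f] < b[f]) return -1;
            if (a[f] > b[f]) return 1;
        }
        return 0;
    }
    var sorted = arr.slice().sort(compare); // slice() copies, so the input is untouched
    return sorted.filter(function (item, index) {
        // After sorting, duplicates are adjacent: keep an item only if it
        // differs from the one right before it
        return index === 0 || compare(sorted[index - 1], item) !== 0;
    });
}

var dishes = [
    { name: "soup", chef: "Ann" },
    { name: "cake", chef: "Bob" },
    { name: "soup", chef: "Ann" },
    { name: "soup", chef: "Bob" }
];
var uniqueDishes = uniqueBySortedFields(dishes, ["name", "chef"]);
console.log(uniqueDishes.length); // 3
```

The sort dominates the cost at O(n log n), so this is slower than a key-based lookup for large arrays, but it needs no composite key and returns the result already sorted.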

Nico answered Sep 22 '22 02:09