Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Array with unique values: indexOf vs new Set

Assuming we have an array with repeated items like so:

var items = ['a', 'd', 'e', 'b', 'c', 'd', 'e', 'f', 'g', 'd', 'f', 'g', 'd', 'j', 'k', 'l', 'd', 'e', 'c', 'd', 'e', 'f', 'g', 'd','c', 'd', 'e', 'f', 'g', 'd'];

What would be the fastest way to have a new array with the values, given that you must loop through the items (might make no sense, but that's the scenario).

Option 1:

var list = [];
items.forEach(function(item) {
    if(list.indexOf(item) == -1)
      list.push(item);
});

Option 2:

var list = [];
items.forEach(function(item) {
    list.push(item);
});

list = Array.from(new Set(list));

Now I have done some testing with console.time and it shows that the option 2 is 5 times faster that the option 1. But I am not sure how trustable this console.time is. Any insight? Is the indexOf what's making the option 1 slower?

Fiddle: https://jsfiddle.net/q9opqvsm/

Edit: Another question: if option 2 is faster, should I change my code from option 1 => option 2. If not, why?

like image 387
yBrodsky Avatar asked Oct 18 '25 19:10

yBrodsky


1 Answers

"Is the indexOf what's making the option 1 slower?"

Yes. indexOf is O(n) meaning you have a loop inside your loop giving you overall O(n2). indexOf is equivalent of doing something like this:

function indexOf(item, array) {
    for (var i=0; i < array.length; i++) {
        if (array[i] === item) {
            return true;
        }
    }
    return false;
}

Which you can see has to, in the worse case (the item isn't already in the array) iterate the whole array. There's no way around it. If you are searching the array for a value you have to look at every item until you either find it, or run out of items.

In option 2 looking up a value in a set is O(1) and Array.from is O(n), so you are O(n) overall.

Making a set is somewhat equivalent to doing something like this (note this doesn't actually produce a set, but an object, so it's not exactly the same):

function makeSet(array) {
    var set = {};
    for (var i=0; i < array.length; i++) {
        if (set[array[i]] === undefined) {   // indexing `set` is O(1)
            set[array[i]] = true;
        }
    }
}

So it's O(n) overall. And creating an array from that is just a case of iterating the set and loading it into an array, which is also O(n). So it's O(n) overall.

Another question: if option 1 is faster, should I change my code from option 1 => option 2. If not, why?

Option 1 isn't faster, but if we pretend it is then the answer is that it depends. Option 1 certainly won't scale as well as option 2, but that doesn't mean option 1 might not be faster for small enough arrays (although I doubt it). Either way, this is premature optimization. If your code is running slow, and you profile your code and identify this part as the bottleneck, then you should worry about it.

Edit:

Small typo, I meant if option 2 is faster. There are no bottlenecks,

So the same arguments about premature optimization still apply. But personally, I'd probably change it. It seems fairly low impact and if anything it might, arguably, have a clearer intent with option 2.

Although - do consider browser support for Set. It's relatively new and not supported by older browsers. See here.

like image 159
Matt Burland Avatar answered Oct 21 '25 09:10

Matt Burland