Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why do arrays of numbers, more data sort faster than arrays of objects, fewer data in Javascript?

For my application in node.js I must sort elements of an array in descending order based on some numeric value (i.e. a numeric rank). Since my application is performance-critical, I decided to build my data structure so that sorting is optimized. I hypothesized that the fewer data contained per element in my array, the faster the sorts will be. To test my hypothesis, I ran the following on three different arrays of length 10000:

EDIT: Guys, it seems as though there was something flawed with my original testing. The first test takes significantly longer than the ones that follow. As such, I have modified my test code to have a 'buffer' sort before the actual sorts. Furthermore, I rotated the order of my tests for a fixed number of trials to reduce any bias that might result from the ordering of the tests themselves. I've modified the results accordingly.

Full source here: https://raw.githubusercontent.com/youngrrrr/js-array-sort-bench-test/master/arraySortTest.js

var buffer = [781197, ... ];
var sparseArray = [781197, ... ];
var sparseArray2 = [{'a' : 781197}, ...];
var denseArray = [{'a' : 781197, 'b': ['r', 'a', 'n', 'd', 'o', 'm'] }, ...];

/* buffer : for some reason, the first test always takes significantly longer than the others. I've added this to try to remove whatever bias there was before... */
console.time('buffer');
random.sort(compareSparse);
console.timeEnd('buffer');
console.log(buffer[0]); // prints "58"


/* sparseArray : an array whose elements are numbers */
console.time('sparse');
sparseArray.sort(compareSparse);
console.timeEnd('sparse');
console.log(sparseArray[0]); // prints "58"

/* sparseArray2 (not an accurate name, just got lazy) :
   an array whose elements are objects with a single key-value pair mapping
   an arbitrary name 'a' to a number (which we sort on) */
console.time('sparse2');
sparseArray2.sort(compareDense);
console.timeEnd('sparse2');
console.log(sparseArray2[0]); // prints "{ a: 58 }"

/* denseArray : an array whose elements are objects with two key-value
   pairs mapping an arbitrary key 'a' to a number (which we sort on) and
   another arbitrary key 'b' to an array (which is just supposed to be 
   extra data for the purpose of my hypothesis) */
console.time('dense');
denseArray.sort(compareDense);
console.timeEnd('dense');
console.log(denseArray[0]); // prints "{ a: 58, b: [ 'r', 'a', 'n', 'd', 'o', 'm' ] }"

function compareSparse(a, b) {
    if (a < b) {
        return -1;
    } else if (a > b) {
        return 1;   }
    else {
        return 0;
    }
}

function compareDense(a, b) {
    if (a.a < b.a) {
            return -1;
        } else if (a.a > b.a) {
            return 1;   }
        else {
            return 0;
        }
    }
}

Old test:

After 25 trials (I know, small sample size but I did this all manually) I got the following times for average sort time:

  • sparseArray: (24 + 23 + 21 + 23 + 21 + 22 + 22 + 22 + 22 + 22 + 21 + 20 + 22 + 24 + 24 + 21 + 22 + 22 + 25 + 23 + 24 + 23 + 21 + 21 + 23) / 25 = 22.32ms
  • sparseArray2: (4 + 4 + 4 + 4 + 4 + 5 + 5 + 5 + 5 + 4 + 6 + 5 + 5 + 4 + 5 + 4 + 4 + 4 + 5 + 6 + 4 + 5 + 4 + 4 + 5) / 25 = 4.56ms
  • denseArray: (5 + 5 + 4 + 5 + 5 + 5 + 5 + 5 + 5 + 6 + 5 + 5 + 4 + 4 + 5 + 5 + 5 + 4 + 5 + 5 + 6 + 5 + 5 + 5 + 4) / 25 = 4.88ms

New test:

After 25 trials (I know, small sample size but I did this all manually) I got the following times for average sort time:

  • sparseArray: (4+4+4+4+3+4+4+4+4+4+4+4+3+4+4)/15 = 3.867ms
  • sparseArray2: (4+4+4+6+5+4+4+4+4+5+5+4+5+5+5)/15 = 4.533ms
  • denseArray: (4+4+4+5+5+4+4+4+4+5+5+4+5+5+5)/15 = 4.466ms

So I've come to the following conclusions:

  • Arrays of numbers sort faster than arrays of objects whose values are numbers. This makes sense intuitively.
  • For some reason, and paradoxically so, more data in a particular element results in faster sorting than less data (as evidenced by sparseArray2 vs denseArray runtimes).

What I want to know is:

  • Are these conclusions backed by any documentation/something other than my testing? That is, did I reach the correct conclusions?
  • And why? Why do arrays of numbers sort faster than arrays of objects (makes sense intuitively, but what's the explanation behind this, if any)? Not only this, but why do arrays containing MORE data seem to sort faster than those containing less data?

And note, I'm not married to these conclusions or anything. The sample size is small and my testing has proven to be flawed before, so my results may very well just be the result of bad testing. Also, there seem to be various factors that I have no awareness about that could be affecting the results (as Ryan O'Hara pointed out in my earlier post). The point of this post is to discover any fact-based explanation for sorting behavior in Javascript.

Thanks for reading!

like image 752
youngrrrr Avatar asked Dec 27 '15 01:12

youngrrrr


1 Answers

Are these conclusions backed by any documentation/something other than my testing? That is, did I reach the correct conclusions?

The specifics of how .sort() is implemented is not required by any specification, therefore the performance aspects of .sort() are only to be discovered via performance testing interesting data sets in browsers or JS implementations of interest. Pretty much all performance questions are best answered with testing in the specific circumstances that matter to you. Generalizations outside of that can easily be misleading or wrong and do not necessarily apply to all configurations.

And why? Why do arrays of numbers sort faster than arrays of objects (makes sense intuitively, but what's the explanation behind this, if any)? Not only this, but why do arrays containing MORE data seem to sort faster than those containing less data?

The performance of a given sort with a custom comparison function is going to be governed by the following items:

  1. The length of the array. A longer array will require more sort comparisons.
  2. The smarts of the internal sort algorithm to reduce the number of sort comparisons as small as possible
  3. The performance of the custom sort function (how long it takes to execute a given sort comparison).

So, if you hold the custom sort function and the .sort() implementation you're using constant and the data in the array constant, then a longer array will take longer to sort.

But, if you change both 1. and 3. above (one in a favorable direction and one in a less favorable direction) as you are doing when you go from sorting an array of numbers to sorting an array of objects by a specific property value, then the delta in speed will dependent upon whether the net change is positive or negative which depends upon several things which are hard to predict outside of a very specific implementation and data set and a lot of testing (in other words, it could go either way).

For some test info on sorting an array of numbers vs. sorting a property from an array of objects, see http://jsperf.com/sort-value-vs-property. To no surprise, it is slightly faster to sort the array of numbers though not by a lot.

like image 101
jfriend00 Avatar answered Nov 07 '22 10:11

jfriend00