For my application in node.js I must sort elements of an array in descending order based on some numeric value (i.e. a numeric rank). Since my application is performance-critical, I decided to build my data structure so that sorting is optimized. I hypothesized that the fewer data contained per element in my array, the faster the sorts will be. To test my hypothesis, I ran the following on three different arrays of length 10000:
EDIT: Guys, it seems as though there was something flawed with my original testing. The first test takes significantly longer than the ones that follow. As such, I have modified my test code to have a 'buffer' sort before the actual sorts. Furthermore, I rotated the order of my tests for a fixed number of trials to reduce any bias that might result from the ordering of the tests themselves. I've modified the results accordingly.
Full source here: https://raw.githubusercontent.com/youngrrrr/js-array-sort-bench-test/master/arraySortTest.js
var buffer = [781197, ... ];
var sparseArray = [781197, ... ];
var sparseArray2 = [{'a' : 781197}, ...];
var denseArray = [{'a' : 781197, 'b': ['r', 'a', 'n', 'd', 'o', 'm'] }, ...];
/* buffer : for some reason, the first test always takes significantly longer than the others. I've added this to try to remove whatever bias there was before... */
console.time('buffer');
random.sort(compareSparse);
console.timeEnd('buffer');
console.log(buffer[0]); // prints "58"
/* sparseArray : an array whose elements are numbers */
console.time('sparse');
sparseArray.sort(compareSparse);
console.timeEnd('sparse');
console.log(sparseArray[0]); // prints "58"
/* sparseArray2 (not an accurate name, just got lazy) :
an array whose elements are objects with a single key-value pair mapping
an arbitrary name 'a' to a number (which we sort on) */
console.time('sparse2');
sparseArray2.sort(compareDense);
console.timeEnd('sparse2');
console.log(sparseArray2[0]); // prints "{ a: 58 }"
/* denseArray : an array whose elements are objects with two key-value
pairs mapping an arbitrary key 'a' to a number (which we sort on) and
another arbitrary key 'b' to an array (which is just supposed to be
extra data for the purpose of my hypothesis) */
console.time('dense');
denseArray.sort(compareDense);
console.timeEnd('dense');
console.log(denseArray[0]); // prints "{ a: 58, b: [ 'r', 'a', 'n', 'd', 'o', 'm' ] }"
function compareSparse(a, b) {
if (a < b) {
return -1;
} else if (a > b) {
return 1; }
else {
return 0;
}
}
function compareDense(a, b) {
if (a.a < b.a) {
return -1;
} else if (a.a > b.a) {
return 1; }
else {
return 0;
}
}
}
Old test:
After 25 trials (I know, small sample size but I did this all manually) I got the following times for average sort time:
New test:
After 25 trials (I know, small sample size but I did this all manually) I got the following times for average sort time:
So I've come to the following conclusions:
What I want to know is:
And note, I'm not married to these conclusions or anything. The sample size is small and my testing has proven to be flawed before, so my results may very well just be the result of bad testing. Also, there seem to be various factors that I have no awareness about that could be affecting the results (as Ryan O'Hara pointed out in my earlier post). The point of this post is to discover any fact-based explanation for sorting behavior in Javascript.
Thanks for reading!
Are these conclusions backed by any documentation/something other than my testing? That is, did I reach the correct conclusions?
The specifics of how .sort()
is implemented is not required by any specification, therefore the performance aspects of .sort()
are only to be discovered via performance testing interesting data sets in browsers or JS implementations of interest. Pretty much all performance questions are best answered with testing in the specific circumstances that matter to you. Generalizations outside of that can easily be misleading or wrong and do not necessarily apply to all configurations.
And why? Why do arrays of numbers sort faster than arrays of objects (makes sense intuitively, but what's the explanation behind this, if any)? Not only this, but why do arrays containing MORE data seem to sort faster than those containing less data?
The performance of a given sort with a custom comparison function is going to be governed by the following items:
So, if you hold the custom sort function and the .sort()
implementation you're using constant and the data in the array constant, then a longer array will take longer to sort.
But, if you change both 1. and 3. above (one in a favorable direction and one in a less favorable direction) as you are doing when you go from sorting an array of numbers to sorting an array of objects by a specific property value, then the delta in speed will dependent upon whether the net change is positive or negative which depends upon several things which are hard to predict outside of a very specific implementation and data set and a lot of testing (in other words, it could go either way).
For some test info on sorting an array of numbers vs. sorting a property from an array of objects, see http://jsperf.com/sort-value-vs-property. To no surprise, it is slightly faster to sort the array of numbers though not by a lot.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With