.forEach vs Object.keys().forEach performance on sparse arrays

Tell me if I'm wrong: array.forEach(callbackFunction) is well suited for sparse arrays. It executes callbackFunction not for each index between zero and the array length, but only for the keys that are actually in the array. And (tell me if I'm wrong) those keys are exactly what Object.keys(array) will give me. Hence (tell me why I'm wrong) it shouldn't make a difference whether .forEach is called on array itself or on Object.keys(array). So why on earth is there this performance difference? It is as if, in one case, a giant pointless loop from zero to length were executed, but not in the other case.

Snippet showing performance difference:

function doNothing(){}
const CONSOLE = document.getElementById('console');

const arr = [];
arr[49888999] = 42; // one element at a high index -> sparse array

let start = performance.now();
arr.forEach(doNothing); // per spec, the callback runs only for present indices
const duration1 = performance.now() - start;

start = performance.now();
Object.keys(arr).forEach(doNothing); // iterates the same single key
const duration2 = performance.now() - start;

CONSOLE.textContent = [duration1, duration2].join('\n');
<pre id='console'></pre>
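For what it's worth, here is a small Node-runnable sketch (assuming the same array shape as in the snippet above) confirming the premise of the question: Object.keys really does return only the one populated index, as a string.

```javascript
// Sketch: a sparse array with a single element at a high index.
const arr = [];
arr[49888999] = 42;

// Object.keys lists only own enumerable keys that actually exist,
// so the 49-million-odd "holes" do not appear.
const keys = Object.keys(arr);
console.log(keys);        // ['49888999']
console.log(keys.length); // 1
console.log(arr.length);  // 49889000 (length is still one past the highest index)
```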

Snippet showing that the callback function IS CALLED ONLY ONCE in BOTH cases:

const console1 = document.getElementById('console1');
const console2 = document.getElementById('console2');
function doNothingVerbose1(){
  // increment the counter displayed in the first <pre>
  console1.textContent = 1 + (+console1.textContent);
}
function doNothingVerbose2(){
  // increment the counter displayed in the second <pre>
  console2.textContent = 1 + (+console2.textContent);
}

const arr = [];
arr[49888999] = 42;

let start = performance.now();
arr.forEach(doNothingVerbose1);
const duration1 = performance.now() - start;

start = performance.now();
Object.keys(arr).forEach(doNothingVerbose2);
const duration2 = performance.now() - start;

console.log(duration1, duration2);
~~~~~ 1 ~~~~~
<pre id='console1'>0</pre>
~~~~~ 2 ~~~~~
<pre id='console2'>0</pre>
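A DOM-free variant of the same demonstration (a sketch assuming Node 16+ for the global performance object; plain counters stand in for the two <pre> elements):

```javascript
// Counters replace the <pre> elements from the browser snippet.
let count1 = 0;
let count2 = 0;

const arr = [];
arr[49888999] = 42;

let start = performance.now();
arr.forEach(() => { count1++; });              // spec skips holes, so one call
const duration1 = performance.now() - start;

start = performance.now();
Object.keys(arr).forEach(() => { count2++; }); // one key, so one call
const duration2 = performance.now() - start;

console.log(count1, count2);       // 1 1: the callback ran once in both cases
console.log(duration1, duration2); // yet the timings typically differ hugely
```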

UPDATE

I just ran a test to find out whether the above arr = []; arr[49888999] = 42; really produces a sparse array, i.e. one with a much smaller memory footprint than arr = new Array(49889000). It does: running the sparse version hundreds of times in a loop takes a couple of seconds but doesn't crash, whereas the new Array(50 million) version crashes the fiddle. So if it's not stored as a 'normal C++ array' inside the engine, then the engine must in some sense already "have" Object.keys of the array - so why doesn't it make efficient use of it? I may have a too simplistic view of what a JS engine has to do; is it wrong to say that the engine must "have" Object.keys because it "has" a sparse array implementation backing our variable arr in some fashion? Maybe someone actually working on a browser/JS engine can shed some light here.

above test on jsperf
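A rough, Node-runnable sketch of the test described in the update (exact timings and memory behaviour are engine-dependent; the crash with new Array(50 million) is as reported in the original fiddle, not reproduced here):

```javascript
// Build the sparse version: one element at a high index. Engines typically
// back this with a dictionary-like store rather than a 50-million-slot buffer.
function makeSparse() {
  const a = [];
  a[49888999] = 42;
  return a;
}

// Creating the sparse array repeatedly is cheap.
const t0 = performance.now();
for (let i = 0; i < 100; i++) makeSparse();
console.log('100 sparse arrays:', (performance.now() - t0).toFixed(1), 'ms');

// The dense counterpart, new Array(49889000), asks for the full length up
// front; allocating it repeatedly is what crashed the original fiddle.
```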

asked Oct 27 '25 05:10 by mathheadinclouds


1 Answer

Ok, ok, ok - so it's just one of those things one has to live with; I didn't want to hear that, but that is the right answer.

I'm going to continue to not read specs, and be bewildered at times. No, I'm not recommending that behavior, it's just the way I roll. Trying it out on the console just makes more sense to me, it's definitely more fun, while specs tend to make me fall asleep. Thankfully, people are different, and not everybody is like that.

Maybe a more interesting question is how to deal with the phenomenon in practice. If, for example, I have to deal with a 'sparse Array' as in "2 items of product 51472 and 1 item of product 81369", I'll use an object ({}) with keys 51472 and 81369, and not an array ([]).

Making it an array just because all keys happen to be non-negative integers is a bad idea (the worst idea of the last 10 thousand years) - because you then have .forEach, which is a FALSE FRIEND.
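A minimal sketch of the approach suggested above, using the illustrative product IDs from the text (a plain object, or a Map, keyed by product ID, so iteration only ever touches keys that exist):

```javascript
// Counts keyed by product ID in a plain object: no holes, no length, and
// Object.keys visits exactly the entries that were stored.
const stock = { 51472: 2, 81369: 1 };

Object.keys(stock).forEach(id => {
  console.log(`product ${id}: ${stock[id]} item(s)`);
});

// A Map works too, and keeps the keys as numbers instead of strings.
const stockMap = new Map([[51472, 2], [81369, 1]]);
for (const [id, count] of stockMap) {
  console.log(id, count);
}
```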

2 related questions:

Why are we allowed to create sparse arrays in JavaScript

What use cases are there in JavaScript for Sparse Arrays?

answered Oct 29 '25 20:10 by mathheadinclouds