
Why is the execution time of this function call changing?

Preface

This issue seems to only affect Chrome/V8, and may not be reproducible in Firefox or other browsers. In summary, the execution time of a function callback increases by an order of magnitude or more if the function is called with a new callback anywhere else.

Simplified Proof-of-Concept

Calling test(callback) arbitrarily many times works as expected, but once you call test(differentCallback), the execution time of the test function increases dramatically no matter what callback is provided (i.e., another call to test(callback) would suffer as well).

This example was updated to use the callback's arguments so that the loop is not optimized away to an empty loop. Callback arguments a and b are summed and added to total, which is logged.

```javascript
function test(callback) {
    let start = performance.now(),
        total = 0;

    // add callback result to total
    for (let i = 0; i < 1e6; i++)
        total += callback(i, i + 1);

    console.log(`took ${(performance.now() - start).toFixed(2)}ms | total: ${total}`);
}

let callback1 = (a, b) => a + b,
    callback2 = (a, b) => a + b;

console.log('FIRST CALLBACK: FASTER');
for (let i = 1; i < 10; i++)
    test(callback1);

console.log('\nNEW CALLBACK: SLOWER');
for (let i = 1; i < 10; i++)
    test(callback2);
```

Original post

I am developing a StateMachine class (source) for a library I'm writing and the logic works as expected, but in profiling it, I've run into an issue. I noticed that when I ran the profiling snippet (in global scope), it would only take about 8ms to finish, but if I ran it a second time, it would take up to 50ms and eventually balloon as high as 400ms. Typically, running the same named function over and over will cause its execution time to drop as the V8 engine optimizes it, but the opposite seems to be happening here.

I've been able to get rid of the problem by wrapping it in a closure, but then I noticed another weird side effect: Calling a different function that relies on the StateMachine class would break the performance for all code depending on the class.

The class is pretty simple: you give it an initial state in the constructor or via init, and you can update the state with the update method, to which you pass a callback that accepts this.state as an argument (and usually modifies it). transition is a method that updates the state repeatedly until the transitionCondition is no longer met.
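For reference, here is a minimal sketch of the interface just described. This is an illustration reconstructed from the description above, not the linked source, so method bodies are assumptions:

```javascript
// Minimal sketch of the StateMachine interface described above.
// Reconstructed from the prose description; not the actual linked source.
class StateMachine {
  constructor(initialState) {
    this.init(initialState);
  }

  init(initialState) {
    this.state = initialState;
    return this;
  }

  // apply a callback that receives (and usually mutates) this.state
  update(stateTransition) {
    stateTransition(this.state);
    return this;
  }

  // keep applying the transition until the condition is no longer met
  transition(stateTransition, transitionCondition) {
    while (transitionCondition(this.state))
      this.update(stateTransition);
    return this;
  }
}

// usage: counts state.test up to 10
const sm = new StateMachine({ test: 0 });
sm.transition(state => state.test++, state => state.test < 10);
console.log(sm.state.test); // 10
```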

Two test functions are provided: red and blue, which are identical, and each will generate a StateMachine with an initial state of { test: 0 } and use the transition method to update the state while state.test < 1e6. The end state is { test: 1000000 }.

You can trigger the profile by clicking the red or blue button, which will run StateMachine.transition 50 times and log the average time the call took to complete. If you click the red or blue button repeatedly, you will see that it clocks in at less than 10ms without issue - but, once you click the other button and call the other version of the same function, everything breaks, and the execution time for both functions will increase by about an order of magnitude.

```javascript
// two identical functions, red() and blue()

function red() {
  let start = performance.now(),
      stateMachine = new StateMachine({
        test: 0
      });

  stateMachine.transition(
    state => state.test++,
    state => state.test < 1e6
  );

  if (stateMachine.state.test !== 1e6) throw 'ASSERT ERROR!';
  else return performance.now() - start;
}

function blue() {
  let start = performance.now(),
      stateMachine = new StateMachine({
        test: 0
      });

  stateMachine.transition(
    state => state.test++,
    state => state.test < 1e6
  );

  if (stateMachine.state.test !== 1e6) throw 'ASSERT ERROR!';
  else return performance.now() - start;
}

// display execution time
const display = (time) =>
  document.getElementById('results').textContent = `Avg: ${time.toFixed(2)}ms`;

// handy dandy Array.avg()
Array.prototype.avg = function() {
  return this.reduce((a, b) => a + b) / this.length;
};

// bindings
document.getElementById('red').addEventListener('click', () => {
  const times = [];
  for (var i = 0; i < 50; i++)
    times.push(red());

  display(times.avg());
});

document.getElementById('blue').addEventListener('click', () => {
  const times = [];
  for (var i = 0; i < 50; i++)
    times.push(blue());

  display(times.avg());
});
```
```html
<script src="https://cdn.jsdelivr.net/gh/TeleworkInc/state-machine@bd486a339dca1b3ad3157df20e832ec23c6eb00b/StateMachine.js"></script>

<h2 id="results">Waiting...</h2>
<button id="red">Red Pill</button>
<button id="blue">Blue Pill</button>

<style>
body{box-sizing:border-box;padding:0 4rem;text-align:center}
button,h2,p{width:100%;margin:auto;text-align:center;font-family:-apple-system,BlinkMacSystemFont,"Segoe UI",Roboto,Helvetica,Arial,sans-serif,"Apple Color Emoji","Segoe UI Emoji","Segoe UI Symbol"}
button{font-size:1rem;padding:.5rem;width:180px;margin:1rem 0;border-radius:20px;outline:none;}
#red{background:rgba(255,0,0,.24)}
#blue{background:rgba(0,0,255,.24)}
</style>
```

Updates

Bug report filed (triaged as a "feature request"; awaiting update). See @jmrk's answer below for more details.

Ultimately, this behavior is unexpected and, IMO, qualifies as a nontrivial bug. The impact for me is significant: on an Intel i7-4770 (8 threads) @ 3.90GHz, execution times in the example above go from an average of 2ms to 45ms (roughly a 20x increase).

As for nontriviality, consider that any subsequent calls to StateMachine.transition after the first one will be unnecessarily slow, regardless of scope or location in the code. The fact that SpiderMonkey does not slow down subsequent calls to transition signals to me that there is room for improvement for this specific optimization logic in V8.

See below, where subsequent calls to StateMachine.transition are slowed:

```javascript
// same source, several times

// 1
(function() {
  let start = performance.now(),
      stateMachine = new StateMachine({
        test: 0
      });

  stateMachine.transition(state => state.test++, state => state.test < 1e6);

  if (stateMachine.state.test !== 1e6) throw 'ASSERT ERROR!';
  console.log(`took ${performance.now() - start}ms`);
})();

// 2
(function() {
  let start = performance.now(),
      stateMachine = new StateMachine({
        test: 0
      });

  stateMachine.transition(state => state.test++, state => state.test < 1e6);

  if (stateMachine.state.test !== 1e6) throw 'ASSERT ERROR!';
  console.log(`took ${performance.now() - start}ms`);
})();

// 3
(function() {
  let start = performance.now(),
      stateMachine = new StateMachine({
        test: 0
      });

  stateMachine.transition(state => state.test++, state => state.test < 1e6);

  if (stateMachine.state.test !== 1e6) throw 'ASSERT ERROR!';
  console.log(`took ${performance.now() - start}ms`);
})();
```
<script src="https://cdn.jsdelivr.net/gh/TeleworkInc/state-machine@bd486a339dca1b3ad3157df20e832ec23c6eb00b/StateMachine.js"></script>

This performance decrease can be avoided by wrapping the code in a named closure, where presumably the optimizer knows the callbacks will not change:

```javascript
var test = (function() {
  let start = performance.now(),
      stateMachine = new StateMachine({
        test: 0
      });

  stateMachine.transition(state => state.test++, state => state.test < 1e6);

  if (stateMachine.state.test !== 1e6) throw 'ASSERT ERROR!';
  console.log(`took ${performance.now() - start}ms`);
});

test();
test();
test();
```
<script src="https://cdn.jsdelivr.net/gh/TeleworkInc/state-machine@bd486a339dca1b3ad3157df20e832ec23c6eb00b/StateMachine.js"></script>

Platform Information

```shell
$ uname -a
Linux workspaces 5.4.0-39-generic #43-Ubuntu SMP Fri Jun 19 10:28:31 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

$ google-chrome --version
Google Chrome 83.0.4103.116
```
C. Lewis, asked Jul 02 '20





1 Answer

V8 developer here. It's not a bug, it's just an optimization that V8 doesn't do. It's interesting to see that Firefox seems to do it...

FWIW, I don't see "ballooning to 400ms"; instead (similar to Jon Trent's comment) I see about 2.5ms at first, and then around 11ms.

Here's the explanation:

When you click only one button, then transition only ever sees one callback. (Strictly speaking it's a new instance of the arrow function every time, but since they all stem from the same function in the source, they're "deduped" for type feedback tracking purposes. Also, strictly speaking it's one callback each for stateTransition and transitionCondition, but that just duplicates the situation; either one alone would reproduce it.) When transition gets optimized, the optimizing compiler decides to inline the called function, because having seen only one function there in the past, it can make a high-confidence guess that it's also always going to be that one function in the future. Since the function does extremely little work, avoiding the overhead of calling it provides a huge performance boost.

Once the second button is clicked, transition sees a second function. It must get deoptimized the first time this happens; since it's still hot it'll get reoptimized soon after, but this time the optimizer decides not to inline, because it's seen more than one function before, and inlining can be very expensive. The result is that from this point onwards, you'll see the time it takes to actually perform these calls. (The fact that both functions have identical source doesn't matter; checking that wouldn't be worth it because outside of toy examples that would almost never be the case.)

There's a workaround, but it's something of a hack, and I don't recommend putting hacks into user code to account for engine behavior. V8 does support "polymorphic inlining", but (currently) only if it can deduce the call target from some object's type. So if you construct "config" objects that have the right functions installed as methods on their prototype, you can get V8 to inline them. Like so:

```javascript
class StateMachine {
  ...
  transition(config, maxCalls = Infinity) {
    let i = 0;
    while (
      config.condition &&
      config.condition(this.state) &&
      i++ < maxCalls
    ) config.transition(this.state);

    return this;
  }
  ...
}

class RedConfig {
  transition(state) { return state.test++ }
  condition(state) { return state.test < 1e6 }
}
class BlueConfig {
  transition(state) { return state.test++ }
  condition(state) { return state.test < 1e6 }
}

function red() {
  ...
  stateMachine.transition(new RedConfig());
  ...
}
function blue() {
  ...
  stateMachine.transition(new BlueConfig());
  ...
}
```

It might be worth filing a bug (crbug.com/v8/new) to ask if the compiler team thinks that this is worth improving. Theoretically it should be possible to inline several functions that are called directly, and branch between the inlined paths based on the value of the function variable that's being called. However I'm not sure there are many cases where the impact is as pronounced as in this simple benchmark, and I know that recently the trend has been towards inlining less rather than more, because on average that tends to be the better tradeoff (there are drawbacks to inlining, and whether it's worth it is necessarily always a guess, because the engine would have to predict the future in order to be sure).

In conclusion, coding with many callbacks is a very flexible and often elegant technique, but it tends to come at an efficiency cost. (There are other varieties of inefficiency: e.g. a call with an inline arrow function like transition(state => state.something) allocates a new function object each time it's executed; that just so happens not to matter much in the example at hand.) Sometimes engines might be able to optimize away the overhead, and sometimes not.
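That per-call allocation can be sidestepped by hoisting the callback out of the hot path. A minimal sketch (function names here are made up for illustration):

```javascript
// An inline arrow expression produces a fresh function object
// every time it is evaluated:
function runEachInline(n) {
  let total = 0;
  for (let i = 0; i < n; i++) {
    const cb = x => x + 1;   // new function object each iteration
    total += cb(i);
  }
  return total;
}

// Hoisted: one function object, reused on every iteration:
const increment = x => x + 1;
function runEachHoisted(n) {
  let total = 0;
  for (let i = 0; i < n; i++) total += increment(i);
  return total;
}

console.log(runEachInline(1e6) === runEachHoisted(1e6)); // true
```

Whether the allocation actually matters depends on the workload; as noted above, it happens not to matter much in the example at hand.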

jmrk, answered Sep 22 '22