
Microoptimization: iterating with local variable vs. class member

I thought I would save some time by declaring the loop variable once as a class member:

struct Foo {
  int i;
  void method1() {
    for(i=0; i<A; ++i) ...
  }
  void method2() {
    for(i=0; i<B; ++i) ...
  }
} foo;

However, this version seems to be about 20% faster:

struct Foo {
  void method1() {
    for(int i=0; i<A; ++i) ...
  }
  void method2() {
    for(int i=0; i<B; ++i) ...
  }
} foo;

when called from this code:

void loop() { // Arduino loops
  foo.method1();
  foo.method2();
}

Can you explain the performance difference?

(I need to run many simple parallel "processes" on an Arduino, where such a micro-optimization makes a difference.)

Jan Turoň asked Mar 07 '15 11:03



2 Answers

When you declare your loop variable inside a loop, it is scoped very narrowly. The compiler is free to keep it in a register all the time, so it does not get committed to memory even once.

When you declare your loop variable as an instance variable, the compiler has no such flexibility. It must keep the variable in memory, in case one of your methods wants to examine its state. For example, suppose you do this in your first code example:

void method2() {
    for(i=0; i<B; ++i) { method3(); }
}
void method3() {
    printf("%d\n", i);
}

the value of i in method3 must be changing as the loop progresses. The compiler has no way around committing all its side effects to memory. Moreover, it cannot assume that i stayed the same when you come back from method3, further increasing the number of memory accesses.
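To make the point above concrete, here is a minimal sketch in which method3 records the value of i it observes on each call. The bound B = 4, the seen array, the count field and the check_observed_values helper are assumptions added for illustration; they are not part of the original question.

```cpp
#include <cassert>

constexpr int B = 4;  // hypothetical loop bound, chosen only for this sketch

struct Foo {
  int i = 0;         // loop counter stored in the object
  int seen[B] = {};  // hypothetical log of the values method3 observes
  int count = 0;     // number of values logged so far

  void method2() {
    for (i = 0; i < B; ++i) method3();
  }

  void method3() {
    // Reads the member i; the guard keeps the log within bounds.
    if (count < B) seen[count++] = i;
  }
};

// method3 must see 0, 1, 2, 3 in turn, so every update of i has to be
// visible to it at the time of the call.
void check_observed_values() {
  Foo foo;
  foo.method2();
  for (int k = 0; k < B; ++k) assert(foo.seen[k] == k);
  assert(foo.i == B);  // the member keeps the final value after the loop
}
```

Because method3 depends on the current value of i, the program is only correct if each increment is observable when method3 runs.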

Dealing with updates in memory requires a lot more CPU cycles than performing updates to register-based variables. That is why it is always a good idea to keep your loop variables scoped down to the loop level.
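As a rough sketch of the two variants side by side, the following computes the same sum with a member counter and with a loop-local counter. The bound A = 1000, the sum workload and the struct names are assumptions made up for this example. Whether a particular compiler emits different code for the two depends on the optimization level and on how much of the program it can see.

```cpp
#include <cassert>

constexpr int A = 1000;  // hypothetical loop bound, not from the question

// Counter stored in the object: other member functions can name it.
struct MemberCounter {
  int i = 0;
  long sum = 0;
  void run() {
    for (i = 0; i < A; ++i) sum += i;
  }
};

// Counter scoped to the loop: nothing outside the loop can name it.
struct LocalCounter {
  long sum = 0;
  void run() {
    for (int i = 0; i < A; ++i) sum += i;
  }
};

// Both variants compute the same sum; only where the counter lives differs.
void check_same_result() {
  MemberCounter m;
  LocalCounter l;
  m.run();
  l.run();
  assert(m.sum == l.sum);
  assert(m.i == A);  // the member counter is still observable afterwards
}
```

The local version leaves no trace of the counter once the loop ends, which is what gives the compiler room to keep it in a register.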

Sergey Kalinichenko answered Nov 07 '22 09:11


Can you explain the performance difference?

The most plausible explanation I could come up with for this performance difference is:

The data member i lives in memory as part of the global object foo, so it cannot be kept in a register all the time. Operations on it are therefore much slower than on the loop-local i, because its scope is very broad: the data member i has to be available to all the member functions of the class.

@DarioOO adds:

In addition, the compiler is not free to keep it temporarily in a register, because method3() could throw an exception, leaving the object in an unwanted state. In theory nothing prevents you from writing int k=this->i; for(k=0;k<A;k++)method3(); this->i=k; yourself. That code would be almost as fast as the local-variable version, but you have to take into account what happens when method3() throws. (I believe that when method3() is guaranteed not to throw, the compiler will do this optimization itself with -O3 or -O4; this is to be verified.)
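The manual workaround from the comment above could look roughly like the sketch below. It assumes that method3 is noexcept and never reads i during the loop, since the member is stale until the final write-back. The bound A = 8, the calls field and the check_write_back helper are hypothetical and exist only for illustration.

```cpp
#include <cassert>

constexpr int A = 8;  // hypothetical loop bound

struct Foo {
  int i = 0;      // member counter, updated only once per loop here
  int calls = 0;  // hypothetical workload: count how often method3 ran

  // Assumed not to throw and not to read i while the loop is running.
  void method3() noexcept { ++calls; }

  void method1() {
    int k = 0;  // local counter that the compiler may keep in a register
    for (; k < A; ++k) method3();
    i = k;      // single write-back to the member after the loop
  }
};

// After the loop, the member holds the same final value as it would
// with the member-counter version.
void check_write_back() {
  Foo foo;
  foo.method1();
  assert(foo.i == A);
  assert(foo.calls == A);
}
```

If method3 could throw, the write-back would be skipped and i would keep its old value, which is the unwanted state the comment warns about.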

shauryachats answered Nov 07 '22 07:11