I thought I would save some time by declaring the loop variable once, as a class member:
struct Foo {
    int i;
    void method1() {
        for(i=0; i<A; ++i) ...
    }
    void method2() {
        for(i=0; i<B; ++i) ...
    }
} foo;
However, this version seems to be about 20% faster:
struct Foo {
    void method1() {
        for(int i=0; i<A; ++i) ...
    }
    void method2() {
        for(int i=0; i<B; ++i) ...
    }
} foo;
when the methods are called from this code:
void loop() { // Arduino main loop, called repeatedly
    foo.method1();
    foo.method2();
}
Can you explain the performance difference?
(I need to run many simple parallel "processes" on an Arduino, where such a micro-optimization makes a difference.)
When you declare your loop variable inside a loop, it is scoped very narrowly. The compiler is free to keep it in a register all the time, so it does not get committed to memory even once.
When you declare your loop variable as an instance variable, the compiler has no such flexibility. It must keep the variable in memory, in case some of your methods want to examine its state. For example, if you do this in your first code example:
void method2() {
    for(i=0; i<B; ++i) { method3(); }
}
void method3() {
    printf("%d\n", i);
}
the value of i in method3 must be changing as the loop progresses. The compiler has no way around committing all its side effects to memory. Moreover, it cannot assume that i stayed the same when you come back from method3, further increasing the number of memory accesses.
Dealing with updates in memory requires a lot more CPU cycles than performing updates to register-based variables. That is why it is always a good idea to keep your loop variables scoped down to the loop level.
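To make the point about observable state concrete, here is a minimal sketch. The names observe, seen, run_member and run_local are hypothetical and not from the question; the sketch only shows why the member counter is part of the object's visible state, and it does not measure timing.

```cpp
// Hypothetical illustration: names below are assumptions, not from the question.
struct Foo {
    int i = 0;     // loop counter stored in the object
    int seen = 0;  // accumulates the values of i that observe() reads

    // Stands in for method3: it reads the member i on every call.
    void observe() { seen += i; }

    // Member counter: each ++i must be visible to observe(),
    // so the compiler may have to store i back to memory every iteration.
    void run_member(int n) {
        for (i = 0; i < n; ++i) observe();
    }

    // Local counter: nothing outside the loop can see k,
    // so the compiler is free to keep it in a register.
    void run_local(int n) {
        for (int k = 0; k < n; ++k) observe();
    }
};
```

With run_member, observe() sees a different i on every call; with run_local, the member i never changes and only the local k advances.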
Can you explain the performance difference?
The most plausible explanation I could come up with for this performance difference is:
Data member i is stored in memory as part of the object (here the global foo) and cannot be kept in a register all the time, because its scope is very broad: it has to cater for all the member functions of the class. Operations on it are therefore much slower than on the loop-local variable i.
@DarioOO adds:
In addition, the compiler is not free to keep it temporarily in a register, because method3() could throw an exception, leaving the object in an unwanted state (theoretically nothing prevents you from writing int k=this->i; for(k=0;k<A;k++) method3(); this->i=k;). That code would be almost as fast as a local variable, but you have to take into account what happens when method3() throws. (I believe that when there is a guarantee it does not throw, the compiler will optimize this with -O3 or -O4; to be verified.)
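A minimal sketch of the two variants from that comment, assuming a build with exceptions enabled. The members calls and fail_at and the method names loop_member and loop_cached are hypothetical, added only so the difference can be observed:

```cpp
#include <stdexcept>

// Hypothetical illustration: calls, fail_at and the loop_* names are assumptions.
struct Foo {
    int i = 0;
    int calls = 0;
    int fail_at = -1;  // number of successful calls after which method3 throws; -1 = never

    void method3() {
        if (calls == fail_at) throw std::runtime_error("method3 failed");
        ++calls;
    }

    // Counter is the data member: if method3 throws, i holds the failing iteration.
    void loop_member(int n) {
        for (i = 0; i < n; ++i) method3();
    }

    // Counter is a local copy, written back only when the loop finishes normally.
    // If method3 throws, this->i keeps its old value.
    void loop_cached(int n) {
        int k;
        for (k = 0; k < n; ++k) method3();
        i = k;
    }
};
```

The two loops leave the object in different states when method3() throws, which is why a compiler cannot silently turn the first form into the second unless it can prove that no exception (or other observer of i) is possible.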