If I have an example function like:
void func1(float a, float b, float c)
{
    setA(a);
    setB(b);
    setC(c);
}
Which calls inlined functions:
inline void setA(float a){ m_a = a; m_isValid = false; }
inline void setB(float b){ m_b = b; m_isValid = false; }
inline void setC(float c){ m_c = c; m_isValid = false; }
Should I care about the duplicated "m_isValid = false" stores, or will the compiler eliminate them when optimizing?
Yes, the compiler can eliminate them. This is commonly known as Dead Store Elimination (in compiler parlance, read = load and write = store).
In general, any useless operation can be optimized away by the compiler, provided it can prove that you (the user) cannot notice the difference (within the bounds set by the language).
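As a small illustration of that rule, here is a sketch (the function name answer is made up for this example). The loop result is never observed, so an optimizing compiler is allowed, though not required, to drop the loop entirely:

```cpp
// Hypothetical example: 'scratch' is computed but never read afterwards.
int answer() {
    int scratch = 0;
    for (int i = 0; i < 1000; ++i) {
        scratch += i;   // no caller can ever observe this value
    }
    return 42;          // an optimizer may reduce the whole body to this line
}
```

Whether a given compiler actually removes the loop depends on the optimization level; the point is only that it is permitted to.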
Dead Store Elimination in particular is generally restricted by how much of the surrounding code the compiler can see. Some examples:
struct Foo { int a; int b; };

void opaque(Foo& x); // opaque, aka unknown definition

Foo foo() {
    Foo x{1, 2};
    x.a = 3;
    return x; // provably returns {3, 2}
              // thus equivalent to Foo foo() { return {3, 2}; }
}

Foo bar() {
    Foo x{1, 2};
    opaque(x); // may use x.a, so need to leave it at '1' for now
    x.a = 3;
    return x;
}

Foo baz() {
    Foo x{1, 2};
    opaque(x);
    x.a = 1; // x.a may have been changed, cannot be optimized
    return x;
}
Note that whether or not you store the same value consecutively is of no importance: as long as the compiler can prove that a variable is not read between two store operations, it can safely eliminate the first.
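Applied to the question's setters, this can be sketched as follows. The class name Params and the members setAll, setAllByHand, isValid and a are assumptions added for illustration; only setA, setB and setC come from the question. Once the setters are inlined, nothing reads m_isValid between the three stores, so the compiler may keep only the last one, which is what setAllByHand spells out manually:

```cpp
// Hypothetical wrapper around the question's setters.
class Params {
public:
    void setA(float a) { m_a = a; m_isValid = false; }
    void setB(float b) { m_b = b; m_isValid = false; }
    void setC(float c) { m_c = c; m_isValid = false; }

    // Same shape as func1 in the question: three stores to m_isValid.
    void setAll(float a, float b, float c) { setA(a); setB(b); setC(c); }

    // What the optimizer is allowed to turn setAll into after inlining:
    // the first two stores to m_isValid are dead and can be dropped.
    void setAllByHand(float a, float b, float c) {
        m_a = a;
        m_b = b;
        m_c = c;
        m_isValid = false;
    }

    bool isValid() const { return m_isValid; }
    float a() const { return m_a; }

private:
    float m_a = 0.0f;
    float m_b = 0.0f;
    float m_c = 0.0f;
    bool m_isValid = true;
};
```

Both versions leave the object in the same state, so there is normally no reason to write the second form by hand.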
A special case: by specification in C++, loads and stores to a volatile object cannot be optimized away. This is because volatile was specified to allow interactions with the hardware, and thus the compiler cannot know a priori whether the hardware will read or write the variable behind the program's back.
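A minimal sketch of the difference, assuming a global that stands in for a memory-mapped register (statusRegister, plainValue and both function names are made up for this example):

```cpp
// Hypothetical stand-in for a hardware register; real code would map an address.
volatile int statusRegister = 0;

// Ordinary global for comparison.
int plainValue = 0;

void writeTwiceVolatile() {
    statusRegister = 1;  // must be emitted: volatile accesses are observable behavior
    statusRegister = 2;  // must be emitted as well
}

void writeTwicePlain() {
    plainValue = 1;      // dead store: never read before being overwritten
    plainValue = 2;      // the only store an optimizer has to keep
}
```

From inside the program both functions look the same afterwards; the difference only shows up in the generated code.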
Another special case: memory synchronization operations (fences, barriers, etc.) used in multi-threaded programs can also prevent this kind of optimization. This is because, much like in the volatile case, the synchronization means that another thread of execution may have read or modified the variable behind this thread's back.
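Here is a sketch of that situation using std::atomic and std::thread. All names (payload, ready, consumed, runHandshake) are invented for the example. The two stores to payload are separated by a release/acquire handshake, so the first one is not dead: the other thread reads it in between.

```cpp
#include <atomic>
#include <thread>

int payload = 0;                     // plain, non-atomic data
std::atomic<bool> ready{false};      // producer -> consumer signal
std::atomic<bool> consumed{false};   // consumer -> producer signal

// Returns the value the consumer thread observed in payload.
int runHandshake() {
    ready.store(false);
    consumed.store(false);
    int seen = -1;

    std::thread consumer([&seen] {
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;                                   // reads the first store
        consumed.store(true, std::memory_order_release);
    });

    payload = 1;                                          // first store
    ready.store(true, std::memory_order_release);         // publishes payload == 1
    while (!consumed.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    payload = 2;                                          // second store, after the handshake

    consumer.join();
    return seen;
}
```

Without the synchronization in between, the compiler could treat payload = 1 as a dead store; with it, the store has to stay.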
Finally, like all optimizations, its effectiveness greatly depends on knowledge of the context. If it is proven that opaque either does not read or does not write x.a, then some stores may be optimized out (provable if the compiler can inspect the definition of opaque), so in general it really depends on inlining and constant propagation.
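For instance, here is a variation on the earlier examples where the helper's body is visible. The names touchB and qux are hypothetical, and Foo is repeated so the snippet stands alone. Because the compiler can see that touchB never reads or writes x.a, the initial store of 1 becomes a dead store again:

```cpp
struct Foo { int a; int b; };

// Hypothetical helper with a visible definition: it only touches x.b.
inline void touchB(Foo& x) { x.b += 10; }

Foo qux() {
    Foo x{1, 2};
    touchB(x);   // once inlined, provably neither reads nor writes x.a
    x.a = 3;     // so the initial x.a = 1 is dead
    return x;    // an optimizer may fold this to: return {3, 12};
}
```

If touchB lived in another translation unit, you would be back in the opaque case unless link-time optimization is enabled.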