Let's say I have a class that's something like this:
class View
{
public:
View(DataContainer &c)
: _c(c)
{
}
inline Elem getElemForCoords(double x, double y)
{
int idx = /* some computation here... */;
return _c.data[idx];
}
private:
DataContainer& _c;
};
If I have a function using this class, is the compiler allowed to optimize it away entirely and just inline the data access?
Is the same still true if View::_c happens to be a std::shared_ptr?
Compiler optimization is generally implemented using a sequence of optimizing transformations, algorithms which take a program and transform it to produce a semantically equivalent output program that uses fewer resources or executes faster.
Compilers are free to optimize code so long as they can guarantee the semantics of the code are not changed.
Which property is the most important for an optimizing compiler? Layers in the cache hierarchy that are closer to the CPU are than layers that are farther from the CPU.
If I have a function using this class, is the compiler allowed to optimize it away entirely and just inline the data access?
Is the same still true if View::_c happens to be a std::shared_ptr?
Absolutely, yes, and yes; as long as it doesn't violate the as-if rule (as already pointed out by Pentadecagon). Whether this optimization really happens is a much more interesting question; it is allowed by the standard. For this code:
#include <memory>
#include <vector>
template <class DataContainer>
class View {
public:
View(DataContainer& c) : c(c) { }
int getElemForCoords(double x, double y) {
int idx = x*y; // some dumb computation
return c->at(idx);
}
private:
DataContainer& c;
};
template <class DataContainer>
View<DataContainer> make_view(DataContainer& c) {
return View<DataContainer>(c);
}
int main(int argc, char* argv[]) {
auto ptr2vec = std::make_shared<std::vector<int>>(2);
auto view = make_view(ptr2vec);
return view.getElemForCoords(1, argc);
}
I have verified, by inspecting the assembly code (g++ -std=c++11 -O3 -S -fwhole-program optaway.cpp
), that the View
class is like it is not there, it adds zero overhead.
Some unsolicited advice.
Inspect the assembly code of your programs; you will learn a lot and start worrying about the right things. shared_ptr
is a heavy-weight object (compared to, for example, unique_ptr
), partly because of all that multi-threading machinery under the hood. If you look at the assembly code, you will worry much more about the overhead of the shared pointer and less about element access. ;)
The inline
in your code is just noise, that function is implicitly inline anyway. Please don't trash your code with the inline keyword; the optimizer is free to treat it as whitespace anyway. Use link time optimization instead (-flto
with gcc). GCC and Clang are surprisingly smart compilers and generate good code.
Profile your code instead of guessing and doing premature optimization. Perf is a great tool.
Want speed? Measure. (by Howard Hinnant)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With