I have discovered that the Intel compiler does not apply return value optimization (RVO) to std::array objects. The following code, which happens to be in the inner loop of my program, is not optimized as well as it could be:
#include <array>

std::array<double, 45> f(const std::array<double, 45>& y) {
    auto dy_dt = std::array<double, 45>( );
    ...
    return dy_dt;
}
I have figured out that this behaviour comes from the fact that my standard library implementation does not explicitly define a copy constructor for std::array. The following code demonstrates that:
class Test {
public:
    Test() = default;
    Test(const Test& x);
};

Test f() {
    auto x = Test( );
    return x;
}
When you compile it with
icpc -c -std=c++11 -qopt-report=2 test.cpp -o test.o
the report file shows
INLINE REPORT: (f(Test *)) [1] main.cpp(7,10)
which proves that the compiler applies RVO (the signature of f is changed so that it can construct the new object directly in the caller's stack frame). But if you comment out the line that declares Test(const Test& x); the report file shows
INLINE REPORT: (f()) [1] main.cpp(7,10)
which proves that RVO is not generated.
The example given in 12.8/31 of the C++11 standard, which defines RVO, uses a class with a copy constructor. So, is this a "bug" in the Intel compiler, or is it a conforming implementation of the standard?
This program causes undefined behaviour with no diagnostic required, due to violation of the One Definition Rule.
A copy-constructor is odr-used when returning by value -- even if copy elision takes place.
A non-inline function being odr-used means that exactly one definition of the function must appear in the program. However, you provided none, and your declaration of the copy-constructor suppresses the compiler-generated definition.
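As a minimal sketch of a well-formed variant of the test, the copy constructor below is defined rather than only declared, so the ODR is satisfied, and it counts its own calls so you can observe whether a copy was elided. The names Counted, make_counted, copies_made_by_return and copies_made_by_explicit_copy are hypothetical and chosen only for illustration; they do not appear in the question.

```cpp
#include <cassert>  // only needed if you check the results with assert

// Hypothetical stand-in for the Test class in the question. The copy
// constructor is user-provided AND defined, so odr-using it is fine.
struct Counted {
    static int copies;                     // copy-constructor calls so far
    Counted() = default;
    Counted(const Counted&) { ++copies; }  // defined: one definition exists
};
int Counted::copies = 0;

// Same shape as f() in the question: a named local returned by value.
// In C++11 the compiler may elide the copies here, but need not.
Counted make_counted() {
    auto x = Counted();
    return x;
}

// Number of copy-constructor calls caused by one by-value return.
// 0 means every permitted elision was performed.
int copies_made_by_return() {
    int before = Counted::copies;
    Counted r = make_counted();
    (void)r;  // silence unused-variable warnings
    return Counted::copies - before;
}

// Baseline: copying from a named object can never be elided,
// so this always reports exactly one copy.
int copies_made_by_explicit_copy() {
    Counted a;
    int before = Counted::copies;
    Counted b(a);
    (void)b;
    return Counted::copies - before;
}
```

Because the constructor has a definition, the program links and its behaviour is defined whether or not elision happens. copies_made_by_explicit_copy() returns 1, while copies_made_by_return() returns 0 only when the compiler elides every copy, which C++11 allows but does not require.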