Here is an SSCCE:
class Vec final {
    public:
        float data[4];
        inline Vec(void) {}
        inline ~Vec(void) {}
};
Vec operator*(float const& scalar, Vec const& vec) {
    Vec result;
    #if 1
        for (int k=0;k<4;++k) result.data[k]=scalar*vec.data[k];
    #else
        float const*__restrict src = vec.data;
        float *__restrict dst = result.data;
        for (int k=0;k<4;++k) dst[k]=scalar*src[k];
    #endif
    return result;
}
int main(int /*argc*/, char* /*argv*/[]) {
    Vec vec;
    Vec scaledf = 2.0f * vec;
    return 0;
}
When compiling, MSVC 2013 informs me (/Qvec-report:2) that
main.cpp(11) : info C5002: loop not vectorized due to reason '1200'
This means that the "[l]oop contains loop-carried data dependences".
I have noticed that commenting out either the constructor or the destructor for Vec (edit: or defaulting them, e.g. Vec()=default;) causes it to vectorize successfully. My question: why?
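For reference, the language-level difference between the two variants can be made visible with type traits. This is only a sketch: the names VecUserProvided and VecDefaulted are hypothetical, and whether MSVC's vectorizer actually keys on triviality is an assumption that the report does not confirm.

```cpp
#include <type_traits>

// Variant from the SSCCE: empty but user-provided constructor and destructor.
// (Hypothetical name, used only for this comparison.)
struct VecUserProvided final {
    float data[4];
    VecUserProvided() {}
    ~VecUserProvided() {}
};

// Variant that reportedly vectorizes: both special members defaulted
// on their first declaration, so they are not user-provided.
struct VecDefaulted final {
    float data[4];
    VecDefaulted() = default;
    ~VecDefaulted() = default;
};

// The user-provided members make the first type non-trivial.
static_assert(!std::is_trivially_destructible<VecUserProvided>::value,
              "a user-provided destructor is never trivial");
static_assert(!std::is_trivial<VecUserProvided>::value,
              "user-provided special members make the class non-trivial");

// The defaulted members keep the second type trivial.
static_assert(std::is_trivially_destructible<VecDefaulted>::value,
              "a defaulted destructor stays trivial here");
static_assert(std::is_trivial<VecDefaulted>::value,
              "defaulted special members keep the class trivial");
```

If the translation unit compiles, all four assertions hold. An empty body {} is therefore not equivalent to =default as far as the type system is concerned.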
Note: Toggling the #if will also make it work. The __restrict is important.
Note: Changing float const& scalar to float const scalar causes the vectorization to report 1303 (vectorization wouldn't be a win), I suspect because the reference can be passed directly into an SSE register while the pass-by-value needs another copy.
Why do you declare an empty non-virtual destructor inline ~Vec(void) {} with an empty default constructor inline Vec(void) {}?
As a result, Vec is no longer a trivial type. The compiler still generates a copy constructor implicitly, but the statement return result; conceptually has to copy result into a temporary returned object and then run the user-provided destructor on result, unless the copy is elided (that may not be what you want).
Either define a copy constructor, or don't define the empty constructor and destructor at all.
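The second suggestion can be sketched as follows, assuming nothing else in the program relies on the empty constructor or destructor. The helper scale_demo is hypothetical and exists only to exercise the operator. Whether the loop is then vectorized still has to be checked with /Qvec-report:2.

```cpp
#include <cassert>

// Vec with no user-declared constructor or destructor: the compiler
// supplies trivial ones, and the class can be copied implicitly.
class Vec final {
public:
    float data[4];
};

Vec operator*(float const& scalar, Vec const& vec) {
    Vec result;
    // Same loop as in the question; every element is written before use.
    for (int k = 0; k < 4; ++k) result.data[k] = scalar * vec.data[k];
    return result;
}

// Hypothetical helper (not part of the original code): scales a known
// vector and checks each component against the expected product.
bool scale_demo() {
    Vec vec = {{1.0f, 2.0f, 3.0f, 4.0f}};  // aggregate initialization
    Vec scaled = 2.0f * vec;
    for (int k = 0; k < 4; ++k) {
        if (scaled.data[k] != 2.0f * vec.data[k]) return false;
    }
    return true;
}
```

Without a user-provided constructor, Vec is an aggregate, so it can be brace-initialized, which also avoids reading the uninitialized data that the original main passes to operator*.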