Does the OpenMP standard guarantee #pragma omp simd to work, i.e. should the compilation fail if the compiler can't vectorize the code?
#include <cstdint>
void foo(uint32_t r[8], uint16_t* ptr)
{
const uint32_t C = 1000;
#pragma omp simd
for (int j = 0; j < 8; ++j)
if (r[j] < C)
r[j] = *(ptr++);
}
gcc and clang fail to vectorize this but do not complain at all (unless you use -fopt-info-vec-optimized-missed
and the like).
No, it is not guaranteed. Relevant portions of the OpenMP 4.5 standard that I could find (emphasis mine):
(1.3) When any thread encounters a simd construct, the iterations of the loop associated with the construct may be executed concurrently using the SIMD lanes that are available to the thread.
(2.8.1) The simd construct can be applied to a loop to indicate that the loop can be transformed into a SIMD loop (that is, multiple iterations of the loop can be executed concurrently using SIMD instructions).
(Appendix C) The number of iterations that are executed concurrently at any given time is implementation defined.
(1.2.7) implementation defined: Behavior that must be documented by the implementation, and is allowed to vary among different compliant implementations. An implementation is allowed to define this behavior as unspecified.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With