Vectorize a function in clang

I am trying to vectorize the following function with clang, following this clang reference. It takes a byte vector and applies a mask as specified in this RFC.

static void apply_mask(vector<uint8_t> &payload, uint8_t (&masking_key)[4]) {
  #pragma clang loop vectorize(enable) interleave(enable)
  for (size_t i = 0; i < payload.size(); i++) {
    payload[i] = payload[i] ^ masking_key[i % 4];
  }
}

The following flags are passed to clang:

-O3
-Rpass=loop-vectorize
-Rpass-analysis=loop-vectorize

However, the vectorization fails with the following error:

WebSocket.cpp:5:
WebSocket.h:14:
In file included from boost/asio/io_service.hpp:767:
In file included from boost/asio/impl/io_service.hpp:19:
In file included from boost/asio/detail/service_registry.hpp:143:
In file included from boost/asio/detail/impl/service_registry.ipp:19:
c++/v1/vector:1498:18: remark: loop not vectorized: could not determine number
      of loop iterations [-Rpass-analysis]
    return this->__begin_[__n];
                 ^
c++/v1/vector:1498:18: error: loop not vectorized: failed explicitly specified
      loop vectorization [-Werror,-Wpass-failed]

How do I vectorize this for loop?

asked May 20 '16 16:05

rahul

1 Answer

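Thanks to @PaulR and @PeterCordes. Unrolling the loop by a factor of 4 works: the size is read once up front, and each iteration XORs four payload bytes as a single 32-bit word.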

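void apply_mask(vector<uint8_t> &payload, const uint8_t (&masking_key)[4]) {
  // Read the size once so the trip count is known before the loop starts.
  const size_t size = payload.size();
  const size_t size4 = size / 4;  // number of whole 32-bit words
  size_t i = 0;
  uint8_t *p = payload.data();  // unlike &payload[0], valid for an empty vector
  // Note: these casts assume the buffer is suitably aligned for uint32_t.
  uint32_t *p32 = reinterpret_cast<uint32_t *>(p);
  const uint32_t m = *reinterpret_cast<const uint32_t *>(&masking_key[0]);

  // Main loop: XOR four payload bytes with the four key bytes per iteration.
#pragma clang loop vectorize(enable) interleave(enable)
  for (i = 0; i < size4; i++) {
    p32[i] = p32[i] ^ m;
  }

  // Tail: mask the remaining 0 to 3 bytes one at a time.
  for (i = (size4*4); i < size; i++) {
    p[i] = p[i] ^ masking_key[i % 4];
  }
}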

gcc.godbolt code
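
One caveat with the version above: it reads `uint8_t` storage through a `uint32_t` pointer, which is formally outside the C++ aliasing rules and assumes the buffer is aligned for 32-bit access. Below is a minimal sketch of the same idea that goes through `std::memcpy` instead. The name `apply_mask_memcpy` is hypothetical (not from the original post), and I have not confirmed that every clang version vectorizes it, so check the `-Rpass=loop-vectorize` output for your compiler.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical variant of apply_mask: same result, but the 32-bit
// accesses go through std::memcpy instead of a reinterpret_cast.
void apply_mask_memcpy(std::vector<std::uint8_t> &payload,
                       const std::uint8_t (&masking_key)[4]) {
  std::uint8_t *p = payload.data();
  const std::size_t size = payload.size();  // read once, before any loop
  const std::size_t words = size / 4;       // number of whole 4-byte groups

  // Load the 4-byte key into one word in native byte order.
  std::uint32_t m;
  std::memcpy(&m, masking_key, sizeof m);

  // Main loop: load 4 bytes, XOR with the key word, store them back.
  // Data and key use the same byte order, so the result is the same
  // on little-endian and big-endian targets.
  for (std::size_t i = 0; i < words; i++) {
    std::uint32_t w;
    std::memcpy(&w, p + 4 * i, sizeof w);
    w ^= m;
    std::memcpy(p + 4 * i, &w, sizeof w);
  }

  // Tail: mask the remaining 0 to 3 bytes one at a time.
  for (std::size_t i = words * 4; i < size; i++) {
    p[i] ^= masking_key[i % 4];
  }
}
```

Since XOR is its own inverse, calling the function twice with the same key restores the original payload, which makes for an easy sanity check.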

answered Nov 06 '22 07:11

rahul
