How to disable vectorization in clang++?

Consider the following small search function:

#include <cstdint>

template <uint32_t N>
int32_t countsearch(const uint32_t *base, uint32_t needle) {
    uint32_t count = 0;
    // Attempted fix: ask clang not to vectorize the loop below.
    #pragma clang loop vectorize(disable)
    for (const uint32_t *probe = base; probe < base + N; probe++) {
        if (*probe < needle)
            count++;
    }
    return count;
}

At -O2 or higher, clang vectorizes this search, resulting in code like this (e.g., for 10 elements):

int countsearch<10u>(unsigned int const*, unsigned int):            # @int countsearch<10u>(unsigned int const*, unsigned int)
        vmovd   xmm0, esi
        vpbroadcastd    ymm0, xmm0
        vpbroadcastd    ymm1, dword ptr [rip + .LCPI0_0] # ymm1 = [2147483648,2147483648,2147483648,2147483648,2147483648,2147483648,2147483648,2147483648]
        vpxor   ymm2, ymm1, ymmword ptr [rdi]
        vpxor   ymm0, ymm0, ymm1
        vpcmpgtd        ymm0, ymm0, ymm2
        cmp     dword ptr [rdi + 32], esi
        vpsrld  ymm1, ymm0, 31
        vextracti128    xmm1, ymm1, 1
        vpsubd  ymm0, ymm1, ymm0
        vpshufd xmm1, xmm0, 78          # xmm1 = xmm0[2,3,0,1]
        vpaddd  ymm0, ymm0, ymm1
        vphaddd ymm0, ymm0, ymm0
        vmovd   eax, xmm0
        adc     eax, 0
        cmp     dword ptr [rdi + 36], esi
        adc     eax, 0
        vzeroupper
        ret

How can I disable this vectorization on the command line or using a #pragma in the code?

I tried the following command line arguments, none of which prevented the vectorization:

-disable-loop-vectorization 
-disable-vectorization
-fno-vectorize 
-fno-tree-vectorize

I also tried #pragma clang loop vectorize(disable) above the loop, as you can see in the code above, without luck.

asked Jul 22 '18 04:07

BeeOnRope

1 Answer

Turn off SLP Vectorization:

clang++ -O2 -fno-slp-vectorize
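
A likely explanation, which the answer itself does not state: with a small compile-time N the loop is probably fully unrolled first, and the resulting straight-line code is then packed by the SLP vectorizer rather than the loop vectorizer. That would be why loop-level options such as -fno-vectorize and the loop pragma had no effect. The following is a minimal sketch of the question's function with the pragma dropped. The build line in the comment is an assumption: the file name countsearch.cpp is hypothetical, and -mavx2 is only a guess at how the AVX2 listing above was produced.

```cpp
// Assumed build line (file name hypothetical, -mavx2 is a guess):
//   clang++ -O2 -mavx2 -fno-vectorize -fno-slp-vectorize -S countsearch.cpp
// -fno-vectorize     turns off the loop vectorizer
// -fno-slp-vectorize turns off the SLP (straight-line code) vectorizer
#include <cstdint>

// Counts how many of the N elements starting at base are below needle.
template <uint32_t N>
int32_t countsearch(const uint32_t *base, uint32_t needle) {
    uint32_t count = 0;
    for (const uint32_t *probe = base; probe < base + N; probe++) {
        if (*probe < needle)
            count++;
    }
    return static_cast<int32_t>(count);
}

// Explicit instantiation so the 10-element version is emitted in the
// assembly output even when nothing else in the file calls it.
template int32_t countsearch<10>(const uint32_t *, uint32_t);
```

To check the result, inspect the generated assembly for countsearch<10u> and confirm that the packed instructions on ymm registers (vpcmpgtd, vpaddd and so on) are gone. Passing both flags is a belt-and-braces choice; per the answer, -fno-slp-vectorize alone was enough for this case.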

answered Oct 11 '22 09:10

Justin

Tags: c++, optimization, x86, vectorization, clang
