
VC++ no longer vectorizes simple for loops with range-based syntax


Before replacing a lot of my "old" for loops with range-based for loops, I ran some tests with Visual Studio 2013:

std::vector<int> numbers;

for (int i = 0; i < 50; ++i) numbers.push_back(i);

int sum = 0;

//vectorization
for (auto number = numbers.begin(); number != numbers.end(); ++number) sum += *number;

//vectorization
for (auto number = numbers.begin(); number != numbers.end(); ++number) {
    auto && ref = *number;
    sum += ref;
}

//definition of range based for loops from http://en.cppreference.com/w/cpp/language/range-for
//vectorization
for (auto __begin = numbers.begin(),
    __end = numbers.end();
    __begin != __end; ++__begin) {
    auto && ref = *__begin;
    sum += ref;
}

//no vectorization :(
for (auto number : numbers) sum += number;

//no vectorization :(
for (auto& number : numbers) sum += number;

//no vectorization :(
for (const auto& number : numbers) sum += number;

//no vectorization :(
for (auto&& number : numbers) sum += number;

printf("%d\n", sum); // sum is an int, so %d rather than %f

Looking at the disassembly, the standard for loops were all vectorized:

00BFE9B0  vpaddd      xmm1,xmm1,xmmword ptr [eax]  
00BFE9B4  add         ecx,4  
00BFE9B7  add         eax,10h  
00BFE9BA  cmp         ecx,edx  
00BFE9BC  jne         main+140h (0BFE9B0h)  

but the range-based for loops were not:

00BFEAC6  add         esi,dword ptr [eax]  
00BFEAC8  lea         eax,[eax+4]  
00BFEACB  inc         ecx  
00BFEACC  cmp         ecx,edi  
00BFEACE  jne         main+256h (0BFEAC6h)  

Is there any reason why the compiler couldn't vectorize these loops?

I would really like to use the new syntax, but losing vectorization is too high a price.

I just saw this question, so I tried the /Qvec-report:2 flag, which gives another reason:

loop not vectorized due to reason '1200'

that is:

Loop contains loop-carried data dependences that prevent vectorization. Different iterations of the loop interfere with each other such that vectorizing the loop would produce wrong answers, and the auto-vectorizer cannot prove to itself that there are no such data dependences.

Is this the same bug? (I also tried the latest VC++ compiler, the "Nov 2013 CTP".)

Should I report it on MS Connect too?

Edit

Following the comments, I ran the same test with a raw int array instead of a vector, so no iterator class is involved, just raw pointers.

Now all loops are vectorized except the two "simulated range-based" loops.

Compiler says this is due to reason '501':

Induction variable is not local; or upper bound is not loop-invariant.

I don't get what's going on...

const size_t size = 50;
int numbers[size];

for (size_t i = 0; i < size; ++i) numbers[i] = static_cast<int>(i);

int sum = 0;

//vectorization
for (auto number = &numbers[0]; number != &numbers[0] + size; ++number) sum += *number;

//vectorization
for (auto number = &numbers[0]; number != &numbers[0] + size; ++number) {
    auto && ref = *number;
    sum += ref;
}

//definition of range based for loops from http://en.cppreference.com/w/cpp/language/range-for
//NO vectorization ?!
for (auto __begin = &numbers[0],
    __end = &numbers[0] + size;
    __begin != __end; ++__begin) {
    auto && ref = *__begin;
    sum += ref;
}

//NO vectorization ?!
for (auto __begin = &numbers[0],
    __end = &numbers[0] + size;
    __begin != __end; ++__begin) {
    auto && ref = *__begin;
    sum += ref;
}

//vectorization ?!
for (auto number : numbers) sum += number;

//vectorization ?!
for (auto& number : numbers) sum += number;

//vectorization ?!
for (const auto& number : numbers) sum += number;

//vectorization ?!
for (auto&& number : numbers) sum += number;

printf("%d\n", sum); // sum is an int, so %d rather than %f