
How to auto-vectorize range-based for loops?

A similar but rather vague question was posted on SO for g++, so I'm posting a specific example for VC++12 / VS2013 in the hope of getting an answer.

cross-link: g++, range based for and vectorization

MSDN gives the following as an example of a loop that can be vectorized:

for (int i=0; i<1000; ++i)
{       
    A[i] = A[i] + 1;
}

(http://msdn.microsoft.com/en-us/library/vstudio/jj658585.aspx)

Here are my range-based analogue of the loop above, a C-style monstrosity, and a similar loop using std::for_each. I compiled with the /Qvec-report:2 flag and added the compiler messages as comments:

#include <vector>
#include <algorithm>

int main()
{
    std::vector<int> vec(1000, 1);

    // simple range-based for loop
    {
        for (int& elem : vec)
        {
            elem = elem + 1;
        }
    } // info C5002 : loop not vectorized due to reason '1304'

    // c-style iteration
    {
        int * begin = vec.data();
        int * end = begin + vec.size();

        for (int* it = begin; it != end; ++it)
        {
            *it = *it + 1;
        }
    } // info C5001: loop vectorized

    // for_each iteration
    {
        std::for_each(vec.begin(), vec.end(), [](int& elem)
        {
            elem = elem + 1;
        });
    } // (no compiler message provided)

    return 0;
}

Only the C-style loop gets vectorized. The MSDN docs describe reason 1304 as follows:

1304: Loop includes assignments that are of different sizes.

It gives the following as an example of code that would trigger a 1304 message:

void code_1304(int *A, short *B)
{
    // Code 1304 is emitted when the compiler detects
    // different sized statements in the loop body.
    // In this case, there is a 32-bit statement and a
    // 16-bit statement.

    // In cases like this consider splitting the loop into loops to 
    // maximize vector register utilization.

    for (int i=0; i<1000; ++i)
    {
        A[i] = A[i] + 1;
        B[i] = B[i] + 1;
    }
}
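
The splitting that the MSDN comment suggests could look roughly like the sketch below. The name code_1304_split and the length parameter n are made up for illustration and are not from the MSDN page; the point is only that each loop now contains statements of a single size.

```cpp
#include <cassert>

// Hypothetical split of the MSDN example above.
// n is an assumed length parameter (the original hard-codes 1000).
void code_1304_split(int *A, short *B, int n)
{
    assert(A != nullptr && B != nullptr && n >= 0);

    // First loop: 32-bit statements only.
    for (int i = 0; i < n; ++i)
    {
        A[i] = A[i] + 1;
    }

    // Second loop: 16-bit statements only.
    for (int i = 0; i < n; ++i)
    {
        B[i] = B[i] + 1;
    }
}
```

Whether each loop is then vectorized would still need to be confirmed with /Qvec-report:2.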

I'm no expert, but I can't see how this relates to my loop, which contains a single int assignment. Is this just buggy reporting? I've noticed that none of the range-based loops in my actual program are getting vectorized. What gives?
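
For reference, C++11 defines a range-based for in terms of an ordinary iterator loop. Below is a sketch of what the loop in the question expands to. The names increment_expanded, range, it and last are placeholders (the standard uses exposition-only names).

```cpp
#include <vector>

// Roughly what `for (int& elem : vec) { elem = elem + 1; }` means
// under the C++11 rules for range-based for.
void increment_expanded(std::vector<int>& vec)
{
    auto&& range = vec;
    for (auto it = range.begin(), last = range.end(); it != last; ++it)
    {
        int& elem = *it;   // the loop variable declared in the range-for
        elem = elem + 1;   // the original loop body
    }
}
```

The body is a single int assignment through a reference, so at the source level there are no statements of different sizes.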

(In case this is buggy behavior I'm running VS2013 Professional Version 12.0.21005.1 REL)

EDIT: Bug report posted: https://connect.microsoft.com/VisualStudio/feedback/details/807826/range-based-for-loops-are-not-vectorized

asked Nov 06 '13 00:11 by quant


1 Answer

Posted bug report here:

https://connect.microsoft.com/VisualStudio/feedback/details/807826/range-based-for-loops-are-not-vectorized

Response:

Hi, thanks for the report.

Vectorizing range-based-for-loop-y code is something we are actively making better. We'll address vectorizing this, plus enabling auto-vectorization for other C++ language & library features in future releases of the compiler.

The emission of reason code 1304 (on x64) and reason code 1301 (on x86) are artifacts of compiler internals. The details of that, for this particular code, is not important.

Thanks for the report! I am closing this MSConnect item. Feel free to respond if you need anything else.

Eric Brumer
Microsoft Visual C++ Team
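
Until the compiler handles this, one possible workaround is to route the hot loop through a small helper that iterates over raw pointers. This assumes the observation in the question holds, where the pointer loop was reported as vectorized (C5001). increment_all is a hypothetical name, and whether a given build actually vectorizes it should be checked with /Qvec-report:2.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: the same pointer loop that the question
// reports as vectorized under VS2013.
inline void increment_all(int* first, int* last)
{
    assert(first <= last);
    for (int* it = first; it != last; ++it)
    {
        *it = *it + 1;
    }
}

// Convenience overload for std::vector<int>, whose storage is contiguous.
inline void increment_all(std::vector<int>& vec)
{
    int* first = vec.data();
    increment_all(first, first + vec.size());
}
```

A call such as increment_all(vec) then replaces the range-based loop at the call site while keeping the pointer arithmetic in one place.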

answered Oct 09 '22 23:10 by quant