I'm a student learning vectorization techniques. I'm trying to get the compiler to vectorize a function that multiplies two matrices (each element of a matrix is itself a matrix, and all inner matrices have the same size). The code looks as follows:
#define f_dim1 2000
#define f_dim2 240
#define s_dim1 240
#define s_dim2 2000
#define i_dim1 4
#define i_dim2 4
void automaticallyBuilt(float* firstMatrix, float* secondMatrix, float* result) {
    for (int i = 0; i < f_dim2; i++) { // rows in first matrix
        for (int j = 0; j < s_dim1; j++) { // columns in second matrix
            for (int o = 0; o < f_dim1; o++) { // row element of first matrix = column element of second
                for (int k = 0; k < i_dim2; k++) { // rows in inner matrix
                    for (int l = 0; l < i_dim1; l++) { // columns in inner matrix
                        for (int h = 0; h < i_dim2; h++) { // row element of inner = column element of inner
                            *(result + i*s_dim1*i_dim2*i_dim1 + j*i_dim2*i_dim1 + k*i_dim1 + l) +=
                                *(firstMatrix + i*f_dim1*i_dim2*i_dim1 + o*i_dim2*i_dim1 + k*i_dim1 + h) *
                                *(secondMatrix + o*s_dim1*i_dim2*i_dim1 + j*i_dim2*i_dim1 + h*i_dim1 + l);
                            // something like result[i][j][k][l] += firstMatrix[i][o][k][h] * secondMatrix[o][j][h][l];
                        }
                    }
                }
            }
        }
    }
}
In an attempt to get the compiler to vectorize it, I modified the code as follows:
#define f_dim1 2000
#define f_dim2 240
#define s_dim1 240
#define s_dim2 2000
#define i_dim1 4
#define i_dim2 4
void automaticallyVectorized(float* firstMatrix, float* secondMatrix, float* result) {
    for (int i = 0; i < f_dim2; i++) { // rows in first matrix
        for (int o = 0; o < f_dim1; o++) { // row element of first matrix = column element of second
            for (int j = 0; j < s_dim1; j++) { // columns in second matrix
                for (int k = 0; k < i_dim2; k++) { // rows in inner matrix
                    for (int h = 0; h < i_dim2; h++) { // row element of inner = column element of inner
                        float firstMatrixInnerRowElement = *(firstMatrix + i*f_dim1*i_dim2*i_dim1 + o*i_dim2*i_dim1 + k*i_dim1 + h);
                        float* resultInnerRow = result + i*s_dim1*i_dim2*i_dim1 + j*i_dim2*i_dim1 + k*i_dim1;
                        float* secondMatrixInnerColumnElementRow = secondMatrix + o*s_dim1*i_dim2*i_dim1 + j*i_dim2*i_dim1 + h*i_dim1;
                        for (int l = 0; l < i_dim1; l++) { // columns in inner matrix
                            resultInnerRow[l] += firstMatrixInnerRowElement * secondMatrixInnerColumnElementRow[l];
                        }
                    }
                }
            }
        }
    }
}
During the build, the compiler emits the following message:
code.cpp(161) : info C5002: Loop not vectorized due to reason: "1204"
Code 1204 is not mentioned here. Furthermore, I haven't found any information about it with Google.
I tried using the __restrict modifier, but had no luck.
I use Visual Studio 2019; building with VS 2017 gives the same result.
Can anybody explain what this reason code means? I can't believe nobody has faced this issue before.
This reason code means that the loops are nested too deeply.
As Eljay suggested, I posted an issue on GitHub using the link at the bottom of the documentation page. Here is the link to the answer.
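If the problem really is nesting depth, one thing to try is moving the three innermost loops into a helper, so that each function contains a nest only three levels deep. The sketch below is an untested assumption, not a confirmed fix: multiplyAddBlock and blockMatrixMultiply are hypothetical names that do not appear in the original code, the outer dimensions are passed as parameters instead of macros, and it assumes the result buffer does not overlap either input. The compiler may still inline the helper and report the same reason code, so it is worth rebuilding with /Qvec-report:2 to check.

```cpp
// Size of one inner matrix side; corresponds to i_dim1 == i_dim2 == 4 in the question.
constexpr int kInner = 4;

// Hypothetical helper: accumulates one inner-matrix product, out += a * b.
// Each block is kInner x kInner floats stored row-major and contiguously.
// __restrict is a promise that the three blocks do not overlap.
static inline void multiplyAddBlock(const float* __restrict a,
                                    const float* __restrict b,
                                    float* __restrict out) {
    for (int k = 0; k < kInner; k++) {         // rows in inner matrix
        for (int h = 0; h < kInner; h++) {     // shared inner dimension
            const float aElem = a[k * kInner + h];
            for (int l = 0; l < kInner; l++) { // columns in inner matrix
                out[k * kInner + l] += aElem * b[h * kInner + l];
            }
        }
    }
}

// Hypothetical driver: result[i][j] += first[i][o] * second[o][j], where every
// element is a kInner x kInner block. For the sizes in the question,
// rows = f_dim2, shared = f_dim1 and cols = s_dim1.
// result must be zero-initialized by the caller, as in the original functions.
void blockMatrixMultiply(const float* first, const float* second, float* result,
                         int rows, int shared, int cols) {
    const int blockSize = kInner * kInner;
    for (int i = 0; i < rows; i++) {
        for (int o = 0; o < shared; o++) {
            for (int j = 0; j < cols; j++) {
                multiplyAddBlock(first + (i * shared + o) * blockSize,
                                 second + (o * cols + j) * blockSize,
                                 result + (i * cols + j) * blockSize);
            }
        }
    }
}
```

The loop order (i, o, j, then k, h, l) is the same as in automaticallyVectorized, so the arithmetic is unchanged; only the nesting visible within a single function differs.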