
Intel AVX: Why is there no 256-bit version of the dot product for double-precision floating-point variables? [closed]

In another question on SO we tried (and succeeded) to find a way to emulate the instruction that is missing from AVX:

 __m256d _mm256_dp_pd(__m256d m1, __m256d m2, const int mask);

Does anyone know why this instruction is missing? There is a partial answer here.

gleeen.gould asked Apr 16 '13 09:04


1 Answer

The underlying reason for this and various other AVX limitations is that, architecturally, AVX is little more than two SSE execution units side by side. You will notice that virtually no AVX instructions operate horizontally across the boundary between the two 128-bit halves of a vector, which is particularly annoying in the case of vpalignr. In general you effectively just get two 128-bit SSE operations in parallel. That is useful for the majority of instructions, which operate in an element-wise fashion, but it is not as useful as a proper 256-bit SIMD implementation.
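To make the two-halves point concrete, here is a minimal sketch of a 4-element double-precision dot product written with AVX intrinsics. It is not necessarily the solution from the other question. The function name dot4_avx is hypothetical, and the target attribute assumes GCC or Clang on x86. Note that _mm256_hadd_pd only adds neighbours inside each 128-bit half, so the two partial sums have to be combined with an explicit extract-and-add step.

```c
#include <immintrin.h>

/* Hypothetical helper: dot product of two 4-element double arrays.
   Assumes GCC or Clang on x86; the target attribute enables AVX for
   this function only, so no -mavx flag is required. */
__attribute__((target("avx")))
double dot4_avx(const double *a, const double *b)
{
    __m256d x = _mm256_loadu_pd(a);              /* a0 a1 | a2 a3 */
    __m256d y = _mm256_loadu_pd(b);              /* b0 b1 | b2 b3 */

    /* Element-wise multiply: works across the full 256 bits. */
    __m256d prod = _mm256_mul_pd(x, y);          /* p0 p1 | p2 p3 */

    /* Horizontal add stays inside each 128-bit half:
       result is p0+p1 p0+p1 | p2+p3 p2+p3. */
    __m256d sums = _mm256_hadd_pd(prod, prod);

    /* Crossing the boundary has to be done by hand:
       pull out both halves and add them. */
    __m128d lo = _mm256_castpd256_pd128(sums);   /* low half  */
    __m128d hi = _mm256_extractf128_pd(sums, 1); /* high half */

    return _mm_cvtsd_f64(_mm_add_sd(lo, hi));    /* (p0+p1) + (p2+p3) */
}
```

Calling dot4_avx on {1, 2, 3, 4} and {5, 6, 7, 8} should return 70. The summation order is (p0+p1) + (p2+p3), which can differ in rounding from a strictly sequential scalar loop.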

Paul R answered Nov 10 '22 22:11