For testing purposes, I am writing short assembly snippets for Intel's Xeon Phi with the icc inline assembler. I now want to use masked vector instructions, but I cannot get the inline assembler to accept them.
For code like this:
vmovapd -64(%%r14, %%r10), %%zmm0{%%k1}
I get the error message
/tmp/icpc5115IWas_.s: Assembler messages:
/tmp/icpc5115IWas_.s:563: Error: junk `%k1' after register
I tried a lot of different combinations, but nothing worked. The compiler version is intel64/13.1up03 under Linux, using GAS syntax.
Edit: The code above actually works with basic (non-extended) inline assembly. So this:
__asm__("vmovapd -64(%r14, %r10), %zmm0{%k1} ");
works, while the following does not:
__asm__("vmovapd -64(%[src], %%r10), %%zmm0{%%k1} "
:
: [src]"r"(src)
:);
I guessed it had something to do with having to write a double % before register names in extended mode, but a single % before the k does not work either.
Intel AVX-512 is a set of CPU instructions intended to accelerate compute, storage, and network workloads. The number 512 refers to the width, in bits, of each vector register, which determines how much data a single instruction can operate on at a time.
The __m512 data type is used to represent the contents of the extended SSE register, the ZMM register, used by the Intel® AVX-512 intrinsics. The __m512 data type can hold sixteen 32-bit floating-point values.
I asked the same question in the Intel Developer Zone: http://software.intel.com/en-us/forums/topic/499145#comment-1776563. The answer is that, in order to use the mask registers on the Xeon Phi in extended inline assembler, you have to put double curly braces around the mask register modifier. (Presumably this is because single braces have a special meaning in extended asm templates, just as a single % does.)
vmovapd %%zmm30, (%%r15, %%r10){{%%k1}}
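To make the escaping rule concrete, here is a minimal sketch in plain C that turns an instruction written for basic asm into the text you would paste into an icc extended asm template, by doubling every %, { and }. The helper name to_extended_template is hypothetical and not part of any toolchain. It only handles fixed register names; operand placeholders such as %[src] must keep their single % and would have to be put back by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (illustration only): copy `src` into `dst`, doubling
 * every '%', '{' and '}' so that an instruction written for basic asm can be
 * used inside an icc extended asm template, as described in the answer.
 * Returns the length the full result needs (excluding the terminating NUL);
 * the output is truncated if `cap` is too small. */
size_t to_extended_template(const char *src, char *dst, size_t cap)
{
    size_t n = 0;

    assert(src != NULL && (dst != NULL || cap == 0));

    for (; *src != '\0'; ++src) {
        /* Characters that are special in an extended template are emitted twice. */
        int reps = (*src == '%' || *src == '{' || *src == '}') ? 2 : 1;
        for (int i = 0; i < reps; ++i) {
            if (n + 1 < cap)   /* always leave room for the NUL */
                dst[n] = *src;
            ++n;
        }
    }
    if (cap > 0)
        dst[n < cap ? n : cap - 1] = '\0';
    return n;
}
```

Passing the basic-asm form "vmovapd %zmm30, (%r15, %r10){%k1}" through this helper yields exactly the line shown above.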