My question concerns EVEX-encoded packed reg-reg instructions without rounding semantics which allow SAE control (Suppress All Exceptions), such as VMIN*, VCVTT*, VGETEXP*, VREDUCE*, VRANGE* etc. Intel declares SAE-awareness only with the full 512-bit vector length, e.g.
VMINPD xmm1 {k1}{z}, xmm2, xmm3
VMINPD ymm1 {k1}{z}, ymm2, ymm3
VMINPD zmm1 {k1}{z}, zmm2, zmm3{sae}
but I don't see a reason why SAE couldn't be applied to instructions where xmm or ymm registers are used.
In chapter 4.6.4 of the Intel Instruction Set Extensions Programming Reference, Table 4-7 says that in instructions without rounding semantics, bit EVEX.b specifies that SAE is applied, and bits EVEX.L'L specify the explicit vector length:
00b: 128-bit (XMM)
01b: 256-bit (YMM)
10b: 512-bit (ZMM)
11b: reserved
so their combination should be legal.
However, NASM assembles vminpd zmm1,zmm2,zmm3,{sae} as 62F1ED185DCB, i.e. EVEX.L'L=00b, EVEX.b=1, which NDISASM 2.12 disassembles back as vminpd xmm1,xmm2,xmm3.
NASM refuses to assemble vminpd ymm1,ymm2,ymm3,{sae}, and NDISASM disassembles 62F1ED385DCB (EVEX.L'L=01b, EVEX.b=1) as vminpd xmm1,xmm2,xmm3.
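To make the two byte strings easier to compare, here is a minimal Python sketch that pulls the EVEX payload fields out of an encoding. The function name decode_evex and the dictionary keys are my own choices for illustration, and the sketch assumes the string starts directly with the 62h EVEX escape byte (no legacy prefixes, and 64-bit mode so that 62h is not BOUND).

```python
def decode_evex(hex_bytes: str) -> dict:
    """Split the three EVEX payload bytes (P0, P1, P2) into named fields.

    Hypothetical helper for illustration only: it assumes the encoding
    starts with the 0x62 EVEX escape byte and does no further validation.
    """
    raw = bytes.fromhex(hex_bytes)
    if len(raw) < 4 or raw[0] != 0x62:
        raise ValueError("not an EVEX-prefixed encoding")
    p0, p1, p2 = raw[1], raw[2], raw[3]
    return {
        "mm": p0 & 0b11,               # opcode map (01b = 0F)
        "W": (p1 >> 7) & 1,            # operand-size bit (1 for the PD form here)
        "vvvv": ~(p1 >> 3) & 0xF,      # second source register, stored inverted
        "pp": p1 & 0b11,               # implied SIMD prefix (01b = 66)
        "z": (p2 >> 7) & 1,            # zeroing-masking
        "LL": (p2 >> 5) & 0b11,        # EVEX.L'L
        "b": (p2 >> 4) & 1,            # EVEX.b (broadcast / RC / SAE context)
        "aaa": p2 & 0b111,             # opmask register
    }


# The two encodings discussed above.
for encoding in ("62F1ED185DCB", "62F1ED385DCB"):
    fields = decode_evex(encoding)
    print(encoding, f"L'L={fields['LL']:02b}b", f"b={fields['b']}")
```

Both strings decode with b=1 and differ only in the L'L field, which matches the values quoted above.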
I wonder how a Knights Landing CPU executes VMINPD ymm1, ymm2, ymm3{sae} (assembled as 62F1ED385DCB, EVEX.L'L=01b, EVEX.b=1)?
Your VMINPD ymm1, ymm2, ymm3{sae} instruction is invalid. According to the instruction set reference for MINPD in the Intel Architecture Instruction Set Extensions Programming Reference (February 2016), only the following encodings are allowed:
66 0F 5D /r                    MINPD xmm1, xmm2/m128
VEX.NDS.128.66.0F.WIG 5D /r    VMINPD xmm1, xmm2, xmm3/m128
VEX.NDS.256.66.0F.WIG 5D /r    VMINPD ymm1, ymm2, ymm3/m256
EVEX.NDS.128.66.0F.W1 5D /r    VMINPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst
EVEX.NDS.256.66.0F.W1 5D /r    VMINPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst
EVEX.NDS.512.66.0F.W1 5D /r    VMINPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{sae}
Notice that only the last version is shown with a {sae} suffix, meaning it's the only form of the instruction you're allowed to use it with. Just because the bits exist to encode a particular instruction doesn't mean it's valid.
Also note that section 4.6.3, SAE Support in EVEX, makes it clear that SAE doesn't apply to 128-bit or 256-bit vectors:
The EVEX encoding system allows arithmetic floating-point instructions without rounding semantic to be encoded with the SAE attribute. This capability applies to scalar and 512-bit vector lengths, register-to-register only, by setting EVEX.b. When EVEX.b is set, “suppress all exceptions” is implied. [...]
I'm not sure, however, whether your hand-crafted instruction would generate an Invalid Opcode exception, whether the EVEX.b bit would simply be ignored, or whether the EVEX.L'L bits would be ignored. EVEX-encoded VMINPD instructions belong to the Type E2 exception class, and according to Table 4-17, Type E2 Class Exception Conditions, the instruction can generate a #UD exception in any of the following cases:
- State requirement, Table 4-8 not met.
- Opcode independent #UD condition in Table 4-9.
- Operand encoding #UD conditions in Table 4-10.
- Opmask encoding #UD condition of Table 4-11.
- If EVEX.L’L != 10b (VL=512).
Only that last reason seems to apply here, but it would mean that your instruction would generate a #UD exception with or without the {sae} modifier. Since this seems to directly contradict the allowed encodings in the instruction summary, I'm not sure what would happen.
On Twitter, iximeow gives some addenda to Ross Ridge's answer above:
- ross ridge is right that the text is invalid, but the important detail is that L'L selects the specific SAE mode, so if you set L'L to indicate ymm, you just get {rd-sae}
- this is to say, if you set b for sae at all, the vector width is immediately fixed to 512 bits
vector widths are fixed to 512 bits*
*except for some cvt instructions where one operand is 512 bits and one operand is smaller
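The reading described in those tweets can be sketched in a few lines of Python. This is only an illustration of the claim, not a statement of what any particular CPU does: the function name interpret_reg_reg is hypothetical, the L'L-to-rounding-mode mapping (00b rn, 01b rd, 10b ru, 11b rz) is the one Intel documents for instructions with static rounding, and applying it to an instruction without rounding semantics such as VMINPD is iximeow's description rather than something verified here. The cvt exception mentioned above is not modelled.

```python
from typing import Tuple

# Explicit vector lengths selected by EVEX.L'L when EVEX.b = 0.
VECTOR_BITS = {0b00: 128, 0b01: 256, 0b10: 512}

# Static rounding suffixes selected by EVEX.L'L when EVEX.b = 1 (reg-reg form).
ROUNDING_SUFFIX = {
    0b00: "{rn-sae}",
    0b01: "{rd-sae}",
    0b10: "{ru-sae}",
    0b11: "{rz-sae}",
}


def interpret_reg_reg(ll: int, b: int) -> Tuple[int, str]:
    """Return (vector width in bits, suffix) for a reg-reg EVEX encoding.

    Hypothetical helper following the tweets: once b is set, L'L no longer
    selects a width, and the width is treated as fixed at 512 bits.
    """
    if b == 0:
        if ll not in VECTOR_BITS:
            raise ValueError("L'L = 11b is reserved")
        return VECTOR_BITS[ll], ""
    return 512, ROUNDING_SUFFIX[ll]


# L'L = 01b with b = 1: not "ymm with SAE", but 512 bits with rd-sae.
print(interpret_reg_reg(0b01, 1))
```

Under this reading, 62F1ED385DCB would be a 512-bit operation with the L'L bits consumed as a rounding-mode selector, rather than a 256-bit operation with SAE.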
(@Pepijn's comment on Ross's answer already linked to those tweets, but I figured it's worth making this a separate answer, if only for visibility.)