
Intel AVX-512: how to set the EVEX.z bit

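The EVEX.z bit is used in AVX-512 together with the k mask registers to control masking. If the z bit is 0, the instruction uses merge-masking: destination elements whose mask bit in the k register is 0 keep their previous values. If the z bit is 1, it uses zero-masking: those elements are set to zero in the destination.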

The syntax looks like this:

VPSUBQ zmm0{k2}{z},zmm1,zmm2

where {z} represents the z bit.
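To make the two masking modes concrete, here is a minimal scalar model in Python. It is only a sketch of the semantics described above, not how the hardware works: `masked_sub` is a hypothetical helper, the registers are modelled as lists of eight integers, and the mask as a plain integer whose bit i controls element i.

```python
LANE_MASK = 0xFFFFFFFFFFFFFFFF  # each VPSUBQ element is a 64-bit quadword


def masked_sub(dest, a, b, mask, zeroing):
    """Scalar model of VPSUBQ dest{k}{z}, a, b (hypothetical helper).

    dest, a, b: lists of 8 integers standing in for zmm registers.
    mask:       integer standing in for a k register (bit i -> element i).
    zeroing:    True models {z} (zero-masking), False models merge-masking.
    """
    out = []
    for i in range(len(dest)):
        if (mask >> i) & 1:
            # Mask bit set: the element is computed normally.
            out.append((a[i] - b[i]) & LANE_MASK)
        elif zeroing:
            # Mask bit clear, z = 1: the element is zeroed.
            out.append(0)
        else:
            # Mask bit clear, z = 0: the old destination value is kept.
            out.append(dest[i])
    return out


dest = [100 + i for i in range(8)]     # 100, 101, ..., 107
a = [10 * (i + 1) for i in range(8)]   # 10, 20, ..., 80
b = [1] * 8
mask = 0b00000101                      # only elements 0 and 2 are enabled

merged = masked_sub(dest, a, b, mask, zeroing=False)
zeroed = masked_sub(dest, a, b, mask, zeroing=True)

print(merged)  # [9, 101, 29, 103, 104, 105, 106, 107]
print(zeroed)  # [9, 0, 29, 0, 0, 0, 0, 0]
```

The enabled elements come out the same in both modes; only the disabled ones differ.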

But how do you set or test the EVEX.z bit? I've searched every resource I can find but I haven't found an answer.

RTC222 asked Mar 20 '20 16:03




1 Answer

As I understand it, what they mean is that VPSUBQ zmm0{k2}{z},zmm1,zmm2 and VPSUBQ zmm0{k2},zmm1,zmm2 are two different instructions whose encodings differ in a single bit, called the "z bit". (Specifically, it is part of the EVEX prefix of the instruction; Wikipedia documents all the fields.)

So you "set the z bit" by specifying {z} in your assembler source, which tells the assembler to generate an instruction with the corresponding bit set. This is documented in many places, such as Intel's vol. 2 instruction set manual, and to some extent in Intel's intrinsics guide, which has mask (merge-masking) and maskz (zero-masking) versions of most intrinsics.

It is not a physical bit in the CPU state, like the direction flag, that would persist from one instruction to the next, so it doesn't make sense to "test" it.


To illustrate, here's what I get by assembling both versions:

00000000  62F1F5CAFBC2      vpsubq zmm0{k2}{z},zmm1,zmm2
00000006  62F1F54AFBC2      vpsubq zmm0{k2},zmm1,zmm2

Note that the encodings differ only in the high bit of the fourth byte (CA vs. 4A). That's your "z bit".
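If you want to check this yourself, here is a small Python sketch that compares the two encodings byte by byte and splits the fourth byte into its EVEX fields (z, L'L, b, V', aaa, from the high bit down). The function name `decode_p2` and the dictionary keys are my own labels, and the offset of 3 assumes the instruction starts with the 0x62 EVEX escape byte, as both encodings here do.

```python
# The two encodings from the listing above.
ZEROING = bytes.fromhex("62F1F5CAFBC2")  # vpsubq zmm0{k2}{z},zmm1,zmm2
MERGING = bytes.fromhex("62F1F54AFBC2")  # vpsubq zmm0{k2},zmm1,zmm2


def decode_p2(code):
    """Split the third EVEX payload byte (fourth byte overall) into fields.

    Values are returned as the raw bits stored in the encoding.
    """
    p2 = code[3]
    return {
        "z": (p2 >> 7) & 1,        # 1 = zero-masking, 0 = merge-masking
        "LL": (p2 >> 5) & 0b11,    # vector length; 0b10 means 512-bit
        "b": (p2 >> 4) & 1,        # broadcast / rounding control
        "V_prime": (p2 >> 3) & 1,  # high bit of the vvvv register, raw
        "aaa": p2 & 0b111,         # mask register number
    }


# Positions where the encodings differ, with the XOR of the two bytes.
diff = [(i, x ^ y) for i, (x, y) in enumerate(zip(ZEROING, MERGING)) if x != y]

print(diff)                     # [(3, 128)]
print(decode_p2(ZEROING)["z"])  # 1
print(decode_p2(MERGING)["z"])  # 0
print(decode_p2(ZEROING)["aaa"])  # 2
```

The only difference is 0x80 at byte index 3, and aaa = 2 in both cases matches the {k2} in the source.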


Maybe you were thinking that you could "set" or "clear" the z bit at runtime, thus changing the masking effect of subsequent instructions? Since it's part of the encoding of each instruction, not the CPU state, that way of thinking only works if you are JIT-compiling the instructions on the fly or using self-modifying code.
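For what it's worth, this is roughly what "setting the z bit" would amount to in a JIT: rewriting one bit in the instruction bytes before they are executed. The Python sketch below only edits a byte buffer and does not execute anything. `with_z` is a hypothetical helper, and it assumes the buffer holds a single EVEX-encoded instruction starting with 0x62 at offset 0, with no other prefixes in front of it.

```python
EVEX_ESCAPE = 0x62  # first byte of every EVEX prefix
P2_OFFSET = 3       # assumption: the EVEX prefix starts at offset 0
Z_BIT = 0x80        # high bit of the fourth byte


def with_z(code, zeroing):
    """Return a copy of an EVEX-encoded instruction with the z bit set or cleared.

    Hypothetical helper: a real JIT would also have to make sure zero-masking
    is valid for the instruction in question before setting the bit.
    """
    if len(code) <= P2_OFFSET or code[0] != EVEX_ESCAPE:
        raise ValueError("not an EVEX-encoded instruction at offset 0")
    patched = bytearray(code)
    if zeroing:
        patched[P2_OFFSET] |= Z_BIT
    else:
        patched[P2_OFFSET] &= ~Z_BIT & 0xFF
    return bytes(patched)


merging = bytes.fromhex("62F1F54AFBC2")  # vpsubq zmm0{k2},zmm1,zmm2
zeroing = with_z(merging, True)

print(zeroing.hex().upper())                  # 62F1F5CAFBC2
print(with_z(zeroing, False).hex().upper())   # 62F1F54AFBC2
```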

In "normal" ahead-of-time code, you'll have to write the code in both versions, once with {z} instructions and once without. Use a conditional jump to decide which version to execute.

Nate Eldredge answered Oct 02 '22 07:10
