When encoding the instruction cmpw $-5, %ax for x86-64, the Intel instruction set reference manual gives me two opcodes to choose from:
Opcode     Instruction        Op/En  64-bit Mode  Compat/Leg Mode  Description
3D iw      CMP AX, imm16      I      Valid        Valid            Compare imm16 with AX.
83 /7 ib   CMP r/m16, imm8    MI     Valid        Valid            Compare imm8 with r/m16.
So there will be two encoding results:
66 3d fb ff ; this for opcode 3d
66 83 f8 fb ; this for opcode 83
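As a rough illustration of how the two byte sequences above come out of the opcode table, here is a minimal Python sketch. The function names are hypothetical helpers invented for this example, and it only models these two CMP forms with AX as the register operand.

```python
import struct

OPERAND_SIZE_PREFIX = 0x66  # selects 16-bit operand size in 32/64-bit mode


def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack a ModR/M byte: mod (2 bits), reg (3 bits), r/m (3 bits)."""
    return (mod << 6) | (reg << 3) | rm


def encode_cmp_ax_imm16(value: int) -> bytes:
    """66 3D iw: CMP AX, imm16 (short form, no ModR/M byte)."""
    return bytes([OPERAND_SIZE_PREFIX, 0x3D]) + struct.pack("<h", value)


def encode_cmp_ax_imm8(value: int) -> bytes:
    """66 83 /7 ib: CMP r/m16, imm8 with r/m = AX (register number 0)."""
    # mod=0b11 selects a register operand; reg=7 is the /7 opcode extension.
    return bytes([OPERAND_SIZE_PREFIX, 0x83, modrm(0b11, 7, 0)]) + struct.pack("<b", value)


print(encode_cmp_ax_imm16(-5).hex(" "))  # 66 3d fb ff
print(encode_cmp_ax_imm8(-5).hex(" "))   # 66 83 f8 fb
```

The f8 byte in the second encoding is the ModR/M byte: mod=11 (register), reg=111 (/7), r/m=000 (AX).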
Then which one is better?
I tried some online disassemblers. Both encodings disassemble to the original instruction. But why does 6683fb00 also work while 663dfb doesn't?
Both encodings are the same length, so that doesn't help us decide.
However, as @Michael Petch commented, the imm16 encoding will cause an LCP stall in the decoders on Intel CPUs. (Because without the 66 operand-size prefix, it would be 3D imm32, so the operand-size prefix changes the length of the rest of the instruction. This is why it's called a Length-Changing-Prefix stall. AFAIK, you'd get the same stall in 16bit code for using a 32bit immediate.)
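To make the "length-changing" part concrete, here is a small sketch that counts the bytes following any prefixes for the two CMP forms, with and without the 66 prefix. It assumes 32/64-bit mode and models only these two opcodes; length_after_prefixes is a made-up name for this example, not a real API.

```python
def length_after_prefixes(opcode: int, has_66_prefix: bool) -> int:
    """Bytes of opcode + ModR/M + immediate, outside 16-bit mode (sketch only)."""
    if opcode == 0x3D:
        # CMP (E)AX, imm: the immediate is as wide as the operand size.
        imm_size = 2 if has_66_prefix else 4
        return 1 + imm_size
    if opcode == 0x83:
        # CMP r/m, imm8 in register form: opcode + ModR/M + 1-byte immediate.
        return 1 + 1 + 1
    raise ValueError("only 0x3D and 0x83 (register form) are modelled")


for opcode in (0x3D, 0x83):
    plain = length_after_prefixes(opcode, False)
    prefixed = length_after_prefixes(opcode, True)
    print(f"{opcode:#04x}: {plain} bytes without 66, {prefixed} with 66, "
          f"length-changing: {plain != prefixed}")
```

Only the 3D form changes length under the prefix, which is why it is the one that triggers the stall.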
The imm8 encoding doesn't cause a problem on any microarchitecture I know of, so favour it. See Agner Fog's microarch.pdf, and other links from the x86 tag wiki.
It can be worth using a longer instruction to avoid an LCP stall. (e.g. if you know the upper 16 bits of the register are zero or sign-extended, using 32bit operand size can avoid the LCP stall.)
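As a sketch of that idea (my own illustration, not something from the Intel manual): if EAX is known to hold the sign-extension of AX, cmp eax, -5 (83 f8 fb, 3 bytes) can stand in for cmp ax, -5; if EAX is known to be zero-extended, cmp eax, 0xFFFB (3d fb ff 00 00, 5 bytes) gives the same equality and unsigned results, though not the same signed ones. The Python below brute-forces all 16-bit values to check this; the helper names are hypothetical.

```python
def cmp_conditions(a: int, b: int, bits: int) -> dict:
    """Outcomes of CMP a, b at a given width: equal (ZF), below (CF), less (SF != OF)."""
    mask = (1 << bits) - 1
    a, b = a & mask, b & mask

    def to_signed(v: int) -> int:
        return v - (1 << bits) if v >> (bits - 1) else v

    return {"equal": a == b, "below": a < b, "less": to_signed(a) < to_signed(b)}


def sign_extend_16_to_32(v: int) -> int:
    return v | 0xFFFF0000 if v & 0x8000 else v


CMP_EAX_IMM8 = bytes.fromhex("83 f8 fb")         # cmp eax, -5 (no prefix needed)
CMP_EAX_IMM32 = bytes.fromhex("3d fb ff 00 00")  # cmp eax, 0xFFFB (no prefix needed)


def check_all_ax_values() -> bool:
    for ax in range(0x10000):
        reference = cmp_conditions(ax, -5, 16)  # what cmp ax, -5 would report
        # Sign-extended EAX: all three conditions must match.
        if cmp_conditions(sign_extend_16_to_32(ax), -5, 32) != reference:
            return False
        # Zero-extended EAX: only equality and unsigned-below are expected to match.
        zero_ext = cmp_conditions(ax, 0xFFFB, 32)
        if (zero_ext["equal"], zero_ext["below"]) != (reference["equal"], reference["below"]):
            return False
    return True


print(check_all_ax_values())
```

Neither replacement carries a 66 prefix, so neither can cause an LCP stall; the zero-extended variant is one byte longer than the 4-byte 16-bit encodings.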
Intel SnB-family CPUs have a uop cache, so instructions don't always have to be re-decoded before executing. Still, the uop cache is small, so avoiding the LCP stall is still worth it.
Of course, if you're tuning for AMD, then this isn't a factor. I forget if Atom and Silvermont decoders also have LCP stalls.
663d is prefix+opcode for cmp ax, imm16. 663dfb doesn't "work" because it consumes the first byte of the following instruction. When the decoder sees 66 3D, it grabs the next 2 bytes from the instruction stream as the immediate.
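A toy decoder for just these two CMP forms shows the same thing. This is only a sketch: decode_cmp16 is a hypothetical helper, it handles nothing beyond 66 3D iw and the register form of 66 83 /7 ib, and a real disassembler does far more.

```python
import struct

REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]


def decode_cmp16(code: bytes) -> str:
    """Decode one 66-prefixed CMP in the 3D iw or 83 /7 ib (register) form."""
    if len(code) < 2 or code[0] != 0x66:
        raise ValueError("expected a 66 operand-size prefix")
    opcode, rest = code[1], code[2:]
    if opcode == 0x3D:
        if len(rest) < 2:
            raise ValueError("truncated: 66 3D needs a 2-byte immediate")
        (imm,) = struct.unpack_from("<h", rest)
        return f"cmp ax, {imm}"
    if opcode == 0x83:
        if len(rest) < 2:
            raise ValueError("truncated: 66 83 needs ModR/M and a 1-byte immediate")
        modrm = rest[0]
        if modrm >> 6 != 0b11 or (modrm >> 3) & 7 != 7:
            raise ValueError("only the register form of /7 (CMP) is modelled")
        (imm,) = struct.unpack_from("<b", rest, 1)
        return f"cmp {REG16[modrm & 7]}, {imm}"
    raise ValueError("unsupported opcode")


for text in ("663dfbff", "6683f8fb", "6683fb00", "663dfb"):
    try:
        print(text, "->", decode_cmp16(bytes.fromhex(text)))
    except ValueError as err:
        print(text, "->", err)
```

Note that 6683fb00 is a complete instruction, but a different one: its ModR/M byte fb selects BX and the immediate is 00, so it decodes as cmp bx, 0. 663dfb has only one of the two immediate bytes, so a disassembler fed just those three bytes has nothing left to complete it with.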