What would be the shortest Intel x86-64 machine code for setting rax to 1?
I tried xor rax,rax and inc al (in NASM syntax), which gives the 5-byte encoding 48 31 c0 fe c0. Would it be possible to achieve the same result in 4 bytes?
You can modify or read any other registers, but you cannot assume that any of them holds a specific value left over from previous instructions.
But Heule et al. state that the current x86-64 design “contains 981 unique mnemonics and a total of 3,684 instruction variants” [2].
rax is the 64-bit, "long" size register. It was added in 2003 during the transition to 64-bit processors. eax is the 32-bit, "int" size register. It was added in 1985 during the transition to 32-bit processors with the 80386 CPU.
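The x86 opcode bytes are 8-bit equivalents of the iii field that we discussed in the simplified encoding. Counting both the one-byte opcodes and the two-byte opcodes that begin with the 0f escape byte, this provides for up to 512 different instruction classes, although the x86 does not yet use them all.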
x86 opcodes are 1 byte for most common instructions, especially instructions that have existed since the 8086. Instructions added later (e.g. bsf and movsx in the 386) often use 2-byte opcodes with a 0f escape byte.
Since there is a byte immediate encoding for push and a one-byte pop for registers, this can be done in three bytes: 6a 01 58, or push $1 / pop %rax.
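As a rough illustration of the three-byte sequence, here is a minimal NASM sketch. The surrounding scaffolding is an assumption added for the example, not part of the answer: it targets x86-64 Linux, uses the exit system call to report the result, and first puts a junk value in rax to mimic "no known previous contents".

```nasm
; Sketch only. Assumed build on x86-64 Linux:
;   nasm -f elf64 one.asm && ld -o one one.o && ./one; echo $?
bits 64
global _start

section .text
_start:
    mov     rax, -1         ; junk value: nothing is assumed about rax beforehand

    push    1               ; 6A 01   push a sign-extended 8-bit immediate
    pop     rax             ; 58      pop it into rax, so rax = 1

    mov     edi, eax        ; pass the low 32 bits of rax as the exit status
    mov     eax, 60         ; Linux x86-64 system call number for exit
    syscall
```

If the sequence works as described, the shell should report an exit status of 1. For comparison, the question's xor rax,rax / inc al is 48 31 C0 FE C0 (5 bytes), and a plain mov eax, 1 is B8 01 00 00 00 (also 5 bytes).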
If you can assume some preconditions, there are tricks that are more efficient (in terms of speed) than the 3-byte push imm8 / pop rax solution.
For speed, mov eax, 1 has many advantages, because it doesn't have any input dependencies and it's only one instruction. Out-of-order execution can get started on it (and anything that depends on it) without waiting for other stuff. (See Agner Fog's guides and the x86 tag wiki).
Obviously many of these take advantage of the fact that writing a 32-bit register zeros the upper half of the full 64-bit register, to avoid the unnecessary REX prefix of the OP's code. (Also note that xor rax,rax is not special-cased as a zeroing idiom on Silvermont. It only recognizes xor-zeroing of 32-bit registers, like eax or r10d, not rax or r10.)
If you have a small known constant in any register to start with, you can use
lea eax, [rcx+1] ; 3 bytes: opcode + ModRM + disp8
disp8 can encode displacements from -128 to +127.
If eax already holds an odd number, the instruction and eax, 1 is also 3 bytes.
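The two precondition-based tricks above might look like this in NASM. The specific preconditions (rcx being 0 or 5, eax being odd) are hypothetical values chosen for illustration. Each instruction is an independent alternative, not a sequence to run in order.

```nasm
; Sketch only: each instruction stands alone under its own assumed precondition.
bits 64
section .text

    ; Assumed precondition: rcx is known to hold 0.
    lea     eax, [rcx+1]    ; 8D 41 01   opcode + ModRM + disp8 (+1)

    ; Assumed precondition: rcx is known to hold 5, so add 1 - 5 = -4.
    lea     eax, [rcx-4]    ; 8D 41 FC   disp8 of -4 encodes as FC

    ; Assumed precondition: eax holds any odd number (low bit set).
    and     eax, 1          ; 83 E0 01   keeps only the low bit, leaving 1
```

All three write eax, so the upper 32 bits of rax are zeroed as well. The lea form works for any known constant c in the source register as long as 1 - c fits in the disp8 range.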
In 32-bit code, inc eax only takes one byte, but those inc/dec opcodes were repurposed as REX prefixes for AMD64. So xor eax,eax / inc eax is 4 bytes in x86-64 code, but only 3 in 32-bit code. Still, if saving 1 byte over mov eax,1 is sufficient, and LEA or AND won't work, this is more efficient than push/pop.
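To make the size difference concrete, the same two instructions can be assembled in both modes. This is a sketch meant for a flat binary listing (for example nasm -f bin with a listing file, an assumed workflow), not a runnable program. The labels are hypothetical names added for the example.

```nasm
; Sketch only: compare encodings of the same source in 64-bit and 32-bit mode.
bits 64
one_64:
    xor     eax, eax        ; 31 C0   zeroing idiom, also clears the upper half of rax
    inc     eax             ; FF C0   two bytes: 40..4F are REX prefixes in 64-bit mode
                            ; total: 4 bytes

bits 32
one_32:
    xor     eax, eax        ; 31 C0
    inc     eax             ; 40      one-byte inc r32 form, valid only outside 64-bit mode
                            ; total: 3 bytes
```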