Shortest Intel x86-64 opcode for rax=1?

What would be the shortest Intel x86-64 opcode for setting rax to 1?

I tried xor rax,rax followed by inc al (in NASM syntax), which assembles to the 5-byte sequence 48 31 c0 fe c0. Would it be possible to achieve the same result in 4 bytes?

You can read or modify any other registers, but you cannot assume that any of them holds a specific value left over from previous instructions.

kubuzetto asked Nov 20 '15 11:11



2 Answers

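Since push has a byte-immediate encoding and pop has a one-byte encoding for registers, this can be done in three bytes: 6a 01 58, or push $1 / pop %rax (AT&T syntax; push 1 / pop rax in NASM). In 64-bit mode the imm8 is sign-extended to 64 bits when pushed, so pop rax leaves exactly 1 in the full register.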

gsg answered Oct 16 '22 12:10


If some pre-conditions are known, there are tricks that are more efficient (in terms of speed) than the 3-byte push imm8 / pop rax solution.

For speed, mov eax, 1 (5 bytes: b8 01 00 00 00) has many advantages: it has no input dependencies and it is only one instruction. Out-of-order execution can get started on it (and anything that depends on it) without waiting for other stuff. (See Agner Fog's guides and the x86 tag wiki.)

Obviously many of these take advantage of the fact that writing a 32-bit register zeros the upper half, to avoid the unnecessary REX prefix of the OP's code. (Also note that xor rax,rax is not special-cased as a zeroing idiom on Silvermont. It only recognizes xor-zeroing of 32-bit registers, like eax or r10d, not rax or r10.)
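To make the zero-extension rule concrete, here is a minimal Python sketch that models rax as a 64-bit integer. The helper names (write64, write32, write8) and the garbage starting value are made up for illustration; the model only captures the rule that a 32-bit write clears the upper half while an 8-bit write leaves the other 56 bits alone.

```python
MASK64 = (1 << 64) - 1

def write64(rax, value):
    """Model a 64-bit register write: the whole register is replaced."""
    return value & MASK64

def write32(rax, value):
    """Model a write to eax: the upper 32 bits of rax become zero."""
    return value & 0xFFFFFFFF

def write8(rax, value):
    """Model a write to al: only the low byte changes, the rest is kept."""
    return (rax & ~0xFF & MASK64) | (value & 0xFF)

garbage = 0xDEADBEEFCAFEF00D  # arbitrary unknown contents of rax

# mov eax, 1: one 32-bit write is enough to get rax == 1
after_mov_eax = write32(garbage, 1)

# mov al, 1 alone (2 bytes: b0 01) is not enough, the old upper bits survive
after_mov_al = write8(garbage, 1)

# xor rax, rax / inc al: zero everything first, then bump the low byte
zeroed = write64(garbage, garbage ^ garbage)
after_xor_inc = write8(zeroed, (zeroed & 0xFF) + 1)

print(hex(after_mov_eax))   # 0x1
print(hex(after_mov_al))    # 0xdeadbeefcafef001
print(hex(after_xor_inc))   # 0x1
```

This is also why the tempting 2-byte mov al, 1 does not answer the question: nothing is known about the upper bits of rax beforehand.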


If any register is known to hold a small constant to start with, you can use LEA with a displacement chosen so that the sum is 1. For example, if rcx is known to be 0:

lea   eax, [rcx+1]    ; 3 bytes: opcode + ModRM + disp8

disp8 can encode displacements from -128 to +127.
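As a rough illustration of how the displacement would be picked, the Python sketch below builds the three machine-code bytes of lea eax, [rcx+disp8] for a known constant c in rcx. The helper names are hypothetical, and the sketch assumes rcx specifically as the base register (ModRM byte 0x41); other base registers need a different ModRM byte and possibly a REX prefix.

```python
def lea_eax_rcx_disp8(disp):
    """Encode lea eax, [rcx+disp] in its disp8 form (hypothetical helper)."""
    if not -128 <= disp <= 127:
        raise ValueError("displacement does not fit in disp8")
    # 0x8d is the LEA opcode; 0x41 is ModRM with mod=01 (disp8),
    # reg=000 (eax), rm=001 (rcx). The displacement is stored as a
    # two's-complement byte.
    return bytes([0x8D, 0x41, disp & 0xFF])

def lea_for_known_rcx(c):
    """Pick the displacement so that c + disp == 1, i.e. disp = 1 - c."""
    return lea_eax_rcx_disp8(1 - c)

print(lea_for_known_rcx(0).hex())    # 8d4101 -> lea eax, [rcx+1]
print(lea_for_known_rcx(100).hex())  # 8d419d -> lea eax, [rcx-99]
```

With this range, the trick covers known constants from -126 to 129; outside it the displacement needs the 4-byte disp32 form and the size advantage is gone.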


If eax is known to hold an odd number, and eax, 1 is also 3 bytes (83 e0 01).


In 32-bit code, inc eax only takes one byte, but those inc/dec opcodes were repurposed as REX prefixes for AMD64. So xor eax,eax / inc eax is 4 bytes in x86-64 code, but only 3 in 32-bit code. Still, if saving 1 byte over a mov eax,1 is sufficient, and LEA or AND won't work, this is more efficient than push/pop.
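For a side-by-side view, this small Python sketch tabulates the sequences mentioned in this thread and counts their bytes. The dictionary and its labels are only an illustration; the hex strings are the standard encodings of each sequence, and the last two entries only produce rax = 1 under the pre-conditions described above.

```python
# Machine code for each candidate, written as hex bytes.
CANDIDATES = {
    "xor rax,rax / inc al": "48 31 c0 fe c0",
    "mov eax,1": "b8 01 00 00 00",
    "xor eax,eax / inc eax (64-bit mode)": "31 c0 ff c0",
    "xor eax,eax / inc eax (32-bit mode)": "31 c0 40",
    "push 1 / pop rax": "6a 01 58",
    "lea eax,[rcx+1] (needs rcx == 0)": "8d 41 01",
    "and eax,1 (needs eax odd)": "83 e0 01",
}

# bytes.fromhex ignores the spaces, so len() gives the encoded size.
sizes = {name: len(bytes.fromhex(code)) for name, code in CANDIDATES.items()}

for name, size in sorted(sizes.items(), key=lambda item: item[1]):
    print(f"{size} bytes  {name}")
```

The 32-bit-mode entry is listed only for comparison: in 64-bit mode the byte 40 is a REX prefix rather than inc eax, which is why the same sequence costs 4 bytes there.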

Peter Cordes answered Oct 16 '22 11:10