
Direction/Sign Extend bit in the encoding of an x86 opcode

In the x86 instruction set, the bit at index 1 of an opcode can be either the direction bit, which specifies which operand is the destination and which is the source, or a sign-extend bit.

e.g. for add

  • 00 /r ADD r/m8, r8 versus 02 /r ADD r8, r/m8
    That bit distinguishes r/m, reg vs. reg, r/m for the same mnemonic
  • 81 /0 id ADD r/m32, imm32 versus 83 /0 ib ADD r/m32, imm8
    full (bit 1 cleared) vs. sign-extended immediate (bit 1 set)
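
To make the two cases concrete, here is a minimal sketch in C that classifies bit 1 for a few primary (one-byte) opcode ranges. It assumes prefixes have already been skipped and covers only the ALU group, MOV, the 80-83 immediate group, and PUSH/IMUL with an immediate; the enum and function names are made up for illustration, and every other opcode is simply reported as "other".

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum bit1_role { BIT1_OTHER, BIT1_DIRECTION, BIT1_SIGN_EXTEND };

/* Classify bit 1 of a primary (one-byte) opcode.
 * Only a few well-known ranges are covered; the rest is BIT1_OTHER. */
static enum bit1_role classify_bit1(uint8_t op)
{
    /* 00-3F: ADD/OR/ADC/SBB/AND/SUB/XOR/CMP. Low three bits 0..3 are the
     * "r/m, reg" and "reg, r/m" forms, where bit 1 is the direction bit. */
    if (op < 0x40 && (op & 0x07) < 4)
        return BIT1_DIRECTION;
    /* 88-8B: MOV r/m, reg and MOV reg, r/m. */
    if (op >= 0x88 && op <= 0x8B)
        return BIT1_DIRECTION;
    /* 80-83: immediate group 1; 83 is the sign-extended imm8 form. */
    if (op >= 0x80 && op <= 0x83)
        return BIT1_SIGN_EXTEND;
    /* 68/6A: PUSH imm32/imm8; 69/6B: IMUL with imm32/imm8. */
    if (op >= 0x68 && op <= 0x6B)
        return BIT1_SIGN_EXTEND;
    return BIT1_OTHER;
}

/* 1 when bit 1 is set: reg is the destination, or the immediate is a
 * sign-extended byte, depending on the role reported above. */
static int bit1_is_set(uint8_t op)
{
    return (op >> 1) & 1;
}
```

With this, `classify_bit1(0x02)` reports a direction bit and `bit1_is_set(0x02)` says reg is the destination, while `classify_bit1(0x83)` reports a sign-extend bit.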

I'm wondering what the easiest logical way is to determine which of these cases applies. Is there a way to tell other than looking up each instruction's opcode and comparing it against the known direction-bit and sign-extend variants? There are also instructions that disregard this bit, but since it is set to 0 in those it doesn't really matter.

EDIT: Turns out that for write faults (which is what my code was intended for), reg->r/m is always the case because an r/m->reg instruction will never trigger a write fault. But any information would still be nice in case someone else is running into a similar issue.

Jesus Ramos asked Aug 03 '11 06:08


2 Answers

[Comment made into answer].

You obviously need a boolean formula over the stream of instruction bytes. I wouldn't know how to define that formula easily; the x86 has a really messy instruction set. I'd expect the key trick is to look up the opcode byte in a table determined by the prefix bytes. If you are writing some kind of disassembler, I'd expect you to have such tables already anyway.
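
A rough sketch of what such a table lookup might look like in C. It assumes 32-bit code, skips only the legacy prefixes, and fills in the table for just a handful of one-byte opcodes; the names and the table contents are illustrative, and anything in the two-byte (0F) map is reported as unknown rather than decoded.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum role { ROLE_UNKNOWN = 0, ROLE_DIRECTION, ROLE_SIGN_EXTEND };

/* Sparse table indexed by the primary opcode byte; unlisted entries
 * default to ROLE_UNKNOWN (0). Only ADD, MOV and the 80-83 group here. */
static const uint8_t role_table[256] = {
    [0x00] = ROLE_DIRECTION, [0x01] = ROLE_DIRECTION,
    [0x02] = ROLE_DIRECTION, [0x03] = ROLE_DIRECTION,
    [0x88] = ROLE_DIRECTION, [0x89] = ROLE_DIRECTION,
    [0x8A] = ROLE_DIRECTION, [0x8B] = ROLE_DIRECTION,
    [0x80] = ROLE_SIGN_EXTEND, [0x81] = ROLE_SIGN_EXTEND,
    [0x82] = ROLE_SIGN_EXTEND, [0x83] = ROLE_SIGN_EXTEND,
};

static int is_legacy_prefix(uint8_t b)
{
    switch (b) {
    case 0x26: case 0x2E: case 0x36: case 0x3E: /* ES, CS, SS, DS */
    case 0x64: case 0x65:                       /* FS, GS */
    case 0x66: case 0x67:                       /* operand/address size */
    case 0xF0: case 0xF2: case 0xF3:            /* LOCK, REPNE, REP */
        return 1;
    default:
        return 0;
    }
}

/* Skip prefixes, then look the opcode byte up in the table. */
static enum role lookup_bit1_role(const uint8_t *code, size_t len)
{
    size_t i = 0;
    while (i < len && is_legacy_prefix(code[i]))
        i++;
    if (i == len || code[i] == 0x0F) /* ran out, or two-byte opcode map */
        return ROLE_UNKNOWN;
    return (enum role)role_table[code[i]];
}
```

A real decoder would select a different table after a 0F escape byte instead of giving up, which is presumably what "determined by the prefix bytes" amounts to in practice.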

Ira Baxter answered Sep 24 '22 03:09


The direction and sign bits are part of the flags register of the x86 processors. The lowest eight bits of the flags have the same layout as the flags of the 8080/8085/Z80, which puts the sign bit at index 7; the bit at index 1 of the flags is reserved. The direction bit is at index 10, and its position has not changed since it was introduced with the 8086/88 processors in the late 70s if my memory serves me.

The sign bit is modified as a result of an arithmetic operation and is a copy of the highest bit of the operation's result. INC and DEC also update the sign bit; the flag they leave untouched is the carry bit.

The direction bit is manipulated using the cld/std instructions and controls whether the block instructions (cmps, ins, lods, movs, outs, scas and stos) post-increment or post-decrement their index registers.

They may also be manipulated via the stack (though this is perhaps not meaningful with the sign bit):

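pushf                           ; push the flags register onto the stack
and dword ptr [esp],SOME_MASK   ; modify the saved copy in place
popf                            ; load the modified copy back into the flags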

Using "and" is an example: or, xor and others may also be used.
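
For illustration, the same and/or arithmetic can be written out in C on a saved copy of the flags. This is only a sketch: the macro and function names are invented here and are not what SOME_MASK stood for; the bit positions (sign flag at bit 7, direction flag at bit 10) are the architectural ones.

```c
#include <stdint.h>

/* Hypothetical names; the bit positions are the architectural ones. */
#define FLAG_SF (UINT32_C(1) << 7)   /* sign flag */
#define FLAG_DF (UINT32_C(1) << 10)  /* direction flag */

/* Like "and" with ~FLAG_DF between pushf and popf (the effect of cld). */
static uint32_t clear_df(uint32_t eflags)
{
    return eflags & ~FLAG_DF;
}

/* Like "or" with FLAG_DF between pushf and popf (the effect of std). */
static uint32_t set_df(uint32_t eflags)
{
    return eflags | FLAG_DF;
}

/* 1 if the sign flag is set in a saved flags value, else 0. */
static int sign_flag(uint32_t eflags)
{
    return (eflags & FLAG_SF) != 0;
}
```

Keeping the old value around, e.g. `uint32_t saved = eflags;` before calling `set_df`, is the C-level analogue of restoring the flag afterwards.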

If you manipulate the flag you may have to restore it to its previous value as some run-time libraries assume that it isn't modified.

Olof Forshell answered Sep 21 '22 03:09