How does MOVSX assembly instruction work?

How does the assembly instruction MOVSX work in the following example:

MOVSX ECX,BYTE PTR DS:[EDX]

In this case, here is the state of the registers:

ECX = 0000000F   
EDX = 0012FD9F 

My understanding was that it takes the last byte of [EDX] (9F), moves it to ECX, and then sign-extends it to 16 bits, giving 0000009F. However, the actual result is 00000016. Can someone explain where I'm wrong?

Abundance asked Oct 21 '15 20:10


2 Answers

That's partially correct. However:

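BYTE PTR DS:[EDX] obtains the byte located at the address held in EDX, not a byte of EDX itself. This byte is copied into the least significant byte of ECX, and the remaining 24 bits are filled with copies of the byte's sign bit (bit 7). The previous contents of ECX are overwritten completely.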
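To make the behavior described above concrete, here is a minimal Python sketch that models it. This is only an illustration, not how the CPU is implemented: the `memory` dictionary and the function names are hypothetical, and the byte value 0x16 is taken from the result reported in the question.

```python
def sign_extend_byte(value: int) -> int:
    """Sign-extend an 8-bit value to 32 bits (returned as an unsigned 32-bit integer)."""
    value &= 0xFF                  # keep only the low 8 bits
    if value & 0x80:               # bit 7 set: the byte is negative
        return value | 0xFFFFFF00  # fill the upper 24 bits with ones
    return value                   # bit 7 clear: the upper 24 bits stay zero


def movsx_r32_m8(memory: dict, address: int) -> int:
    """Model of MOVSX r32, BYTE PTR [address]: load one byte, then sign-extend it."""
    return sign_extend_byte(memory[address])


# Hypothetical memory contents: the byte 0x16 stored at the address held in EDX.
memory = {0x0012FD9F: 0x16}
edx = 0x0012FD9F
ecx = movsx_r32_m8(memory, edx)
print(f"ECX = {ecx:08X}")  # ECX = 00000016
```

Note that EDX is only used as an address here. If the byte in memory had been 0x9F instead, the same model would give FFFFFF9F, because bit 7 of 0x9F is set.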

For your unexpected result, this means that the byte 0x16 is located at memory address[1] 0x12FD9F.


Notes:

  • The segment override prefix DS: isn't necessary here. [EDX] refers to DS by default.

[1] "memory address" refers to either virtual or physical memory here

cadaniluk answered Sep 25 '22 13:09


Many Intel/AMD x86 instructions are available in "ModR/M" format: they have two operands, one of which must be a register, while the other may be either a register or a memory reference. The address of a memory operand is determined by the ModR/M byte of the instruction encoding, and possibly by subsequent bytes of the instruction, such as the SIB (scale-index-base) byte and the displacement (a constant memory offset). A segment prefix byte, if present, also takes part.

Usually these are reg,reg/mem instructions, of the form

   rsrcdst += rsrc
or
   rsrcdst += Memory[ ... addressing mode ...]

But x86 assembly code does not have separate opcodes / instruction mnemonics for the reg,reg and reg,mem forms of these instructions. Whether an operand is a register or a memory location is indicated, in the assembler, by assembly syntax.
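As a rough illustration of how the ModR/M byte selects between the register and memory forms, here is a small Python sketch. It assumes 32-bit mode and the MOVSX r32, r/m8 encoding (opcode bytes 0F BE followed by a ModR/M byte), and it only handles the two simplest cases. The function names are made up for this example.

```python
# Register numbering used by the ModR/M byte in 32-bit mode.
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]


def split_modrm(modrm: int):
    """Split a ModR/M byte into its mod (bits 7:6), reg (5:3) and rm (2:0) fields."""
    mod = (modrm >> 6) & 0b11
    reg = (modrm >> 3) & 0b111
    rm = modrm & 0b111
    return mod, reg, rm


def describe_movsx_r32_rm8(modrm: int) -> str:
    """Describe MOVSX r32, r/m8 (0F BE /r) for the register and plain [reg] forms."""
    mod, reg, rm = split_modrm(modrm)
    dest = REG32[reg]
    if mod == 0b11:
        # mod = 11: the r/m field names an 8-bit register
        return f"MOVSX {dest},{REG8[rm]}"
    if mod == 0b00 and rm not in (0b100, 0b101):
        # mod = 00: the r/m field names a register that holds the address
        return f"MOVSX {dest},BYTE PTR [{REG32[rm]}]"
    raise NotImplementedError("SIB and displacement forms are not modeled here")


print(describe_movsx_r32_rm8(0x0A))  # MOVSX ECX,BYTE PTR [EDX]
print(describe_movsx_r32_rm8(0xCA))  # MOVSX ECX,DL
```

The two ModR/M bytes differ only in the mod field, which is why one mnemonic covers both the reg,reg and reg,mem forms.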

In this case, your assembly code is

MOVSX ECX,BYTE PTR DS:[EDX]

The instruction mnemonic is MOVSX.

The destination operand is register ECX.

The source operand is "BYTE PTR DS:[EDX]". That this is a memory reference is indicated by several things: (1) the square brackets around "[EDX]" - square brackets are a shorthand for Memory[...address...]. (2) the "DS:" prefix, which indicates that it is in the data segment. Register operands do not have such a segment prefix. (3) the "BYTE PTR" - which says "take the memory address specified by 'DS:[EDX]', and interpret it as referencing an 8-bit byte in memory".

I suspect that what you really want is

MOVSX ECX,DL

"DL" is a name for the low 8 bits of the 32-bit register EDX, i.e. DL=EDX.bits[7:0]. Unfortunately, x86 assemblers usually don't accept syntax like "EDX.bits[7:0]" (unless I wrote them), so you have to know the historical names of the sub-registers:

AL = EAX.bits[7:0]
AH = EAX.bits[15:8]
AX = EAX.bits[15:0]
EAX = 32 bit register that "covers" all of the above

and so on: BL, CL, DL, DI, ...
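The sub-register naming above, and what MOVSX ECX,DL would do with the EDX value from the question, can be sketched in Python as follows. The helper names are hypothetical and this is only a model of the bit selection, not assembler syntax.

```python
def low8(reg32: int) -> int:
    """AL/BL/CL/DL: bits [7:0] of the 32-bit register."""
    return reg32 & 0xFF


def high8(reg32: int) -> int:
    """AH/BH/CH/DH: bits [15:8] of the 32-bit register."""
    return (reg32 >> 8) & 0xFF


def low16(reg32: int) -> int:
    """AX/BX/CX/DX: bits [15:0] of the 32-bit register."""
    return reg32 & 0xFFFF


def movsx_r32_r8(byte: int) -> int:
    """Model of MOVSX r32, r8: sign-extend an 8-bit register value to 32 bits."""
    byte &= 0xFF
    return byte | 0xFFFFFF00 if byte & 0x80 else byte


edx = 0x0012FD9F           # EDX value from the question
dl = low8(edx)             # 0x9F
ecx = movsx_r32_r8(dl)     # bit 7 of 0x9F is set, so the upper bits become ones
print(f"DL = {dl:02X}, ECX = {ecx:08X}")  # DL = 9F, ECX = FFFFFF9F
```

So even the register form would not produce 0000009F: zero-filling the upper bits is what MOVZX does, while MOVSX copies the sign bit.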

Krazy Glew answered Sep 23 '22 13:09