
How to disassemble movb instruction

I am writing a disassembler and I was reviewing the instruction format (and doing some disassembling by hand) and I ran into an instruction that I can't seem to be able to decode.

The output for that particular instruction (from objdump) is:

c6 05 14 a0 04 08 01    movb   $0x1,0x804a014

However, I don't understand how the instruction is decoded, since the opcode c6 is supposed to be MOV Eb Ib (Mod R/M to imm8).

Can somebody enlighten me as to how it is decoded?

Thanks!

karurosu asked Sep 11 '12 06:09


3 Answers

This is explained (in part) by Alexey Frunze's answer, but his is a bit terse, so I will provide some explanation here:

  1. The opcode is c6 /0, which indicates that the instruction has two operands: an r/m8 operand, encoded through the ModR/M byte, and an immediate operand. Both operands are 8 bits wide.
  2. The /0 means that part of the opcode is encoded in the ModR/M byte: bits 3-5 of that byte (the reg field) act as an opcode extension. When c6 is followed by a ModR/M byte whose bits 3-5 have the value 0, you get a mov.
  3. The value 05 (the byte that immediately follows c6) is the ModR/M byte: 00 000 101 in binary.
  4. The top two bits (bits 6-7) are the mod field and the last three (bits 0-2) are the r/m field. In 32-bit addressing with mod = 00, an r/m value of 101 (5) means "use a 32-bit displacement with no base register", so the 4 bytes following the ModR/M byte form an absolute address.
  5. 14 a0 04 08 is the little-endian encoding of 0x0804a014.
  6. The last byte, 01, is the immediate value to store at that address.
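
The field split described in the steps above can be sketched in C. This is only an illustration: the struct and function names (modrm_fields, split_modrm) are made up for this example and do not come from any library.

```c
#include <stdint.h>

/* Hypothetical helper type: the three fields of an x86 ModR/M byte. */
struct modrm_fields {
    uint8_t mod; /* bits 7-6: addressing mode */
    uint8_t reg; /* bits 5-3: register number, or opcode extension (/digit) */
    uint8_t rm;  /* bits 2-0: register or memory operand selector */
};

/* Split a ModR/M byte into its mod, reg and r/m fields. */
struct modrm_fields split_modrm(uint8_t byte)
{
    struct modrm_fields f;
    f.mod = (uint8_t)(byte >> 6);
    f.reg = (uint8_t)((byte >> 3) & 0x7);
    f.rm  = (uint8_t)(byte & 0x7);
    return f;
}
```

For the byte 05 from the question, split_modrm(0x05) gives mod = 0, reg = 0 and rm = 5, which matches the 00 000 101 breakdown in step 3.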

I hope this helps.

Scott Wisniewski answered Oct 23 '22 01:10


c6 - opcode (there's also a part of opcode in Mod/RM byte, in /digit(reg) field)
05 - Mod/RM byte (mod=00b, r/m=101b, /digit(reg)=0 - part of opcode)
14 a0 04 08 - disp32
01 - imm8

And it's a mov from Ib to Eb. You're probably confusing the AT&T syntax, in which objdump is showing the disassembly, with that of Intel/AMD documentation. The order of operands in AT&T syntax is the opposite of that in x86 CPU manuals.
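
To make the operand-order difference concrete, here is a small C sketch that prints the same decoded instruction both ways. The function names (format_att, format_intel) are hypothetical, and the spacing is simplified compared with real objdump output, which pads the mnemonic column.

```c
#include <stdint.h>
#include <stdio.h>

/* AT&T syntax: source first, destination last, immediates prefixed with '$'.
 * This is the order objdump uses by default. */
int format_att(char *out, size_t n, uint32_t addr, uint8_t imm)
{
    return snprintf(out, n, "movb $0x%x,0x%x", (unsigned)imm, (unsigned)addr);
}

/* Intel syntax: destination first, source last, as in the CPU manuals.
 * The "BYTE PTR ds:" spelling is modelled on objdump's Intel mode. */
int format_intel(char *out, size_t n, uint32_t addr, uint8_t imm)
{
    return snprintf(out, n, "mov BYTE PTR ds:0x%x,0x%x",
                    (unsigned)addr, (unsigned)imm);
}
```

Called with addr = 0x0804a014 and imm = 0x01, the first function produces the AT&T form from the question and the second puts the memory operand first, which is the order the MOV Eb, Ib entry in the opcode map describes.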

Alexey Frunze answered Oct 23 '22 03:10


Well, moving to an immediate doesn't mean anything. What that instruction does is move the constant 1 into the memory byte located at 0x804a014. It is roughly equivalent to this C code:

*(unsigned char *)0x804a014 = 1;

You've got opcode c6, as you know. You can look that up as part of the MOV instruction in Volume 2A of the Intel Software Developer's Manual.

The 05 is the ModR/M byte. You can decipher that using Table 2-2 of volume 2A, "32-Bit Addressing Forms With the ModR/M Byte". Look for 05 in the "Value of ModR/M Byte (in Hexadecimal)" part of the chart. Trace left from there, and you'll see that the effective address for this ModR/M value is given in 'disp32' form. The footnote there says "The disp32 nomenclature denotes a 32-bit displacement that follows the ModR/M byte". In this case that's the next four bytes of your instruction: 14 a0 04 08.

Finally, you have the 8-bit immediate 01, and the complete instruction is decoded.
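
The walk-through above can be turned into a rough C sketch. It assumes 32-bit addressing, no prefixes and no SIB byte, and it only accepts the disp32-only form of c6 /0. The function names (read_le32, disp_size32, decode_mov_c6_abs) are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble a 32-bit value from four little-endian bytes. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Displacement size implied by mod and r/m in 32-bit addressing.
 * Simplification: the SIB case (r/m = 100) is not examined here, although
 * an SIB byte with base = 101 and mod = 00 also implies a disp32. */
size_t disp_size32(uint8_t mod, uint8_t rm)
{
    if (mod == 0) return rm == 5 ? 4 : 0;
    if (mod == 1) return 1;
    if (mod == 2) return 4;
    return 0; /* mod == 3: register operand, no displacement */
}

/* Decode "c6 /0 ib" when the operand is a bare disp32.
 * Returns the instruction length in bytes, or 0 if the bytes do not match. */
size_t decode_mov_c6_abs(const uint8_t *code, size_t len,
                         uint32_t *addr, uint8_t *imm)
{
    if (len < 2 || code[0] != 0xc6) return 0;
    uint8_t mod = code[1] >> 6;
    uint8_t reg = (code[1] >> 3) & 0x7;
    uint8_t rm  = code[1] & 0x7;
    if (reg != 0) return 0;             /* /0 selects MOV */
    if (mod != 0 || rm != 5) return 0;  /* only the disp32-only form */
    size_t disp = disp_size32(mod, rm);
    size_t total = 2 + disp + 1;        /* opcode + ModR/M + disp32 + imm8 */
    if (len < total) return 0;
    *addr = read_le32(code + 2);
    *imm = code[2 + disp];
    return total;
}
```

Feeding it the seven bytes c6 05 14 a0 04 08 01 yields a length of 7, the address 0x0804a014 and the immediate 1. A real disassembler would drive this from opcode tables rather than hard-coding a single form.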

Carl Norum answered Oct 23 '22 03:10