
What is the difference between MOV and LEA in terms of retrieving an address

Tags: x86, assembly, nasm

What exactly is the difference between mov and lea when I use them to get an address?

Say I have a program that prints a character string starting from its 5th character, with the code shown below:

section .text
    global _start
_start:
    mov edx, 0x06  ;the length of msg from its 5th char to the last is 6.
    lea ecx, [msg + 4]
    mov ebx, 1
    mov eax, 4
    int 0x80

section .data
msg db '1234567890'

Then, if I swap lea ecx, [msg + 4] for mov ecx, msg + 4, would it run differently?

I tried both and the outputs appeared to be the same. However, in the comments on the first answer to "What's the purpose of the LEA instruction?", someone seemed to claim that something like mov ecx, msg + 4 is invalid, and I don't see why. Can someone help me understand this? Thanks in advance!

glenjoker asked Feb 18 '16 07:02


1 Answer

When the absolute address is a link-time constant, mov r32, imm32 and lea r32, [addr] will both get the job done. The imm32 can be any valid NASM expression, and in this case msg + 4 is a link-time constant. The linker finds the final address of msg and adds 4 to it, because the 4-byte placeholder in the .o already holds the +4. That final value replaces the placeholder when the linker copies the bytes from the .o into its output.

Exactly the same thing happens to the 4-byte displacement in lea's effective address.
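As a rough illustration of the two forms side by side, here is a minimal sketch, assuming 32-bit code assembled with nasm -f elf32. The byte counts in the comments are the standard encodings for these two instructions; the surrounding program (the exit call and the .data section) is only scaffolding added here so the file assembles on its own.

```nasm
; Sketch: both instructions load the same link-time constant into ecx.
section .text
    global _start
_start:
    mov    ecx, msg + 4         ; B9 imm32:     5 bytes, address is the immediate
    lea    ecx, [msg + 4]       ; 8D 0D disp32: 6 bytes, address is the displacement

    mov    eax, 1               ; __NR_exit
    xor    ebx, ebx
    int    0x80                 ; exit(0)

section .data
msg db '1234567890'
```

Disassembling the linked binary should show the same 32-bit value filled in for both the immediate and the displacement.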


mov r32, imm32 has a slightly shorter encoding (5 bytes, versus 6 for lea with a 32-bit displacement) and can run on more execution ports. Use mov reg, imm unless you can take advantage of lea to do some useful math with registers at the same time. For example: lea ecx, [msg + 4 + eax*4 + edx]
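To make the "useful math" case concrete, here is a small sketch, again assuming 32-bit code built with nasm -f elf32. The values put in eax and edx are hypothetical, chosen only so there is something to add. The program computes the same address once with lea and once with separate instructions, then exits with the difference as its status, so a status of 0 means the two results agree.

```nasm
section .text
    global _start
_start:
    mov    eax, 1                         ; hypothetical index
    mov    edx, 1                         ; hypothetical extra byte offset

    lea    ecx, [msg + 4 + eax*4 + edx]   ; one instruction, flags untouched

    mov    esi, eax                       ; the same sum without lea
    shl    esi, 2                         ; esi = eax*4 (modifies flags)
    add    esi, edx
    add    esi, msg + 4                   ; msg + 4 is again a plain imm32

    sub    ecx, esi                       ; 0 if both methods agree
    mov    ebx, ecx
    mov    eax, 1                         ; __NR_exit
    int    0x80                           ; exit(ecx - esi)

section .data
msg db '1234567890'
```

The lea version needs no scratch register and leaves EFLAGS alone, which is where it earns its extra byte.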

In 64-bit mode, where RIP-relative addressing is possible, using LEA lets you write efficient position-independent code (code that doesn't need to be modified if it is mapped to a different virtual address). There's no way to achieve this with mov. See "How to load address of function or label into register in GNU Assembler" (which also covers NASM) and "Referencing the contents of a memory location. (x86 addressing modes)".
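A 64-bit version of the question's program might look like the sketch below. This is an adaptation written for illustration, not code from the linked answers: it assumes Linux x86-64, nasm -f elf64, and the native syscall interface (write is call number 1, exit is 60, arguments in rdi, rsi, rdx). NASM's default rel directive makes [msg + 4] assemble as a RIP-relative address.

```nasm
default rel                     ; treat [symbol] as [rel symbol]

section .text
    global _start
_start:
    lea    rsi, [msg + 4]       ; RIP-relative: works wherever the code is mapped
    mov    edx, msgsize - 4
    mov    edi, 1               ; stdout
    mov    eax, 1               ; __NR_write on x86-64
    syscall                     ; write(1, msg+4, msgsize-4)

    xor    edi, edi
    mov    eax, 60              ; __NR_exit on x86-64
    syscall                     ; exit(0)

section .rodata
msg db '1234567890'
msgsize equ $-msg
```

Replacing the lea with mov esi, msg + 4 would embed an absolute 32-bit address instead, which only works when the executable is linked at a fixed address that fits in 32 bits.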

Also see the x86 tag wiki for many good links.


Also note that you can use a symbolic constant for the size instead of hard-coding it. You can also format and comment your code better: indenting the operands to a consistent column looks less messy in code where some instructions have longer mnemonics.

section .text
    global _start
_start:
    mov    edx, msgsize - 4
    mov    ecx, msg + 4     ; In MASM syntax, this would be mov ecx, OFFSET msg + 4
    mov    ebx, 1           ; stdout
    mov    eax, 4           ; __NR_write
    int    0x80             ; write(1, msg+4, msgsize-4)

    mov    eax, 1           ; __NR_exit
    xor    ebx, ebx         ; exit status goes in ebx
    int    0x80             ; exit(0)
    ;; otherwise execution falls through into non-code and segfaults

section .rodata
msg db '1234567890'     ; note, not null-terminated, and no newline
msgsize equ $-msg       ; current position - start of message
Peter Cordes answered Oct 04 '22 00:10