
Assembly difference between [var], and var

Tags: x86, assembly, nasm

I'm learning assembly and have reached the point where I have no clue what the difference is between [variable] and variable. The tutorials say both are pointers, so what is the point of having both? And why do I have to use a type identifier before []? My assembler: NASM, x86_64, running on Linux (Ubuntu).

TheFrenchPlays Hd Micraftn asked Sep 13 '16 15:09


3 Answers

In x86 Intel syntax, [expression] means the content of memory at the address expression.
(Except in MASM: when expression is a numeric literal or equ constant with no registers, [expression] is still an immediate.)


What expression means without brackets depends on the assembler you are using.

NASM-style (NASM, YASM):

mov eax,variable      ; moves address of variable into eax
lea eax,[variable]    ; equivalent to the previous one (LEA is the exception: it computes the address, no memory access)
mov eax,[variable]    ; loads content of variable into eax
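
As a rough C analogy for the NASM-style lines above (a sketch, not what NASM emits): the bare symbol behaves like &variable, and the bracketed form behaves like reading the variable. The names address_of_variable, content_of_variable and load_dword are made up for illustration, and variable stands in for a label defined with dd 41.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a NASM label: `variable dd 41` */
uint32_t variable = 41;

/* mov eax, variable    -> the address the label marks */
uintptr_t address_of_variable(void) {
    return (uintptr_t)&variable;
}

/* mov eax, [variable]  -> the 32-bit content stored at that address */
uint32_t content_of_variable(void) {
    return variable;
}

/* mov eax, [ebx]       -> dereference an address held in a register */
uint32_t load_dword(uintptr_t address) {
    assert(address != 0); /* a null address would fault */
    return *(const uint32_t *)address;
}
```

Loading through the address returned by address_of_variable gives the same value as content_of_variable, which is all the brackets do.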

MASM-style (also TASM and even GCC/GAS .intel_syntax noprefix):

mov eax,variable      ; load content of variable (for lazy programmers)
mov eax,OFFSET variable   ; address of variable
lea eax,[variable]    ; address of variable
mov eax,[variable]    ; content of variable

GAS (AT&T syntax): It's not Intel syntax, see the AT&T tag wiki. GAS also uses different directives (like .byte instead of db), even in .intel_syntax mode.


In all cases, variable is an alias for a symbol marking the particular place in memory where the label appeared. So:

variable1  db  41
variable2  dw  41
label1:

adds three symbols to the symbol table: variable1, variable2 and label1.

When you use any of them in the code, like mov eax,<symbol>, the assembler has no information about whether the symbol was defined by db, by dw, or as a plain label. So it will not warn you when you do mov [variable1],ebx, which stores 4 bytes and therefore overwrites 3 bytes beyond the single byte that was defined.

A symbol is simply an address in memory.

(Except in MASM, where the db or dd after a label in a data section does associate a size with that "variable name".)
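
To see the "3 bytes beyond" effect without an assembler, here is a small C sketch that models the data above as raw bytes. Everything in it is an assumption for illustration: the offsets, the 0x90 filler byte at label1, the stored value, and the helper names store_dword_at and bytes_clobbered_beyond_variable1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the data above as raw bytes:
 *   offset 0: variable1 db 41   (1 byte)
 *   offset 1: variable2 dw 41   (2 bytes, little-endian: 41, 0)
 *   offset 3: label1            (whatever comes next, here 0x90) */
#define OFFSET_VARIABLE1 0u
#define SECTION_SIZE     4u

/* Rough analogue of `mov [symbol], ebx`: always writes 4 bytes,
 * no matter how the symbol was declared. */
void store_dword_at(uint8_t *section, size_t offset, uint32_t value) {
    /* C needs the write to stay inside the array; the assembler checks nothing. */
    assert(offset + sizeof value <= SECTION_SIZE);
    memcpy(section + offset, &value, sizeof value);
}

/* Counts the bytes after variable1's own byte that the store changed. */
int bytes_clobbered_beyond_variable1(void) {
    uint8_t section[SECTION_SIZE] = { 41, 41, 0, 0x90 };
    uint8_t before[SECTION_SIZE];
    memcpy(before, section, sizeof section);

    store_dword_at(section, OFFSET_VARIABLE1, 0xAABBCCDDu);

    int changed = 0;
    for (size_t i = OFFSET_VARIABLE1 + 1u; i < SECTION_SIZE; i++) {
        if (section[i] != before[i]) {
            changed++;
        }
    }
    return changed;
}
```

In this model the store destroys both bytes of variable2 and the first byte at label1, exactly because the symbol carries no size.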


In most assemblers, a type identifier (size specifier) is required only when the operand size can't be deduced from the instruction operands themselves.

mov [ebx],eax ; obviously 32 bits are stored, because eax is 32b wide
mov [ebx],1   ; ERROR: how "wide" is that immediate value 1?
mov [ebx],WORD 1 ; NASM syntax (16 bit value, storing two bytes)
mov WORD [ebx],1 ; NASM syntax (16 bit value, storing two bytes)
mov WORD PTR [ebx],1 ; MASM/TASM syntax
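
The reason the size matters is that the same immediate 1 turns into a different number of bytes in memory. A minimal C sketch of that, with made-up helper names (store_imm_byte, store_imm_word, store_imm_dword, bytes_written) and a 0xFF-filled scratch buffer standing in for the memory at [ebx]:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* mov BYTE  [ebx], 1 */
void store_imm_byte(uint8_t *dst)  { uint8_t  v = 1; memcpy(dst, &v, sizeof v); }
/* mov WORD  [ebx], 1 */
void store_imm_word(uint8_t *dst)  { uint16_t v = 1; memcpy(dst, &v, sizeof v); }
/* mov DWORD [ebx], 1 */
void store_imm_dword(uint8_t *dst) { uint32_t v = 1; memcpy(dst, &v, sizeof v); }

/* Counts how many bytes of a 0xFF-filled buffer the given store changed. */
int bytes_written(void (*store)(uint8_t *)) {
    assert(store != NULL);
    uint8_t buf[4];
    memset(buf, 0xFF, sizeof buf);
    store(buf);

    int n = 0;
    for (size_t i = 0; i < sizeof buf; i++) {
        if (buf[i] != 0xFF) {
            n++;
        }
    }
    return n;
}
```

The three stores touch 1, 2 and 4 bytes respectively, so without a size specifier the assembler has no way to pick one.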
Ped7g answered Oct 06 '22 00:10


A little example using registers and pointers:

mov eax, 10 means: move the value 10 into the register EAX. In this case, EAX is used just to store something. What EAX contained before doesn't matter at all to the programmer, since it will be overwritten anyway.

mov dword [eax], 10 means: move the value 10 into the memory at the address stored in the EAX register (NASM needs the dword size specifier here because neither operand implies a width). In this case, the value stored in EAX matters a lot to us, since it's a pointer: we have to go to the EAX register, see what it contains, and then use this value as the address to access.

Two steps are then needed when you use a pointer:

  1. Go to EAX, and see what value it contains (for example EAX = 0xBABA);

  2. Go to the address pointed to by EAX (in our case 0xBABA) and write 10 to it.

Of course, a pointer doesn't have to be held in a register; this little example is just to explain how it works.

Ryan B. answered Oct 06 '22 00:10


Since you already know C++, I'm going to answer by showing you what the C equivalents of these expressions are.

When you write

[variable]

in assembly, it's equivalent to

*variable

in C. That is, treat variable as a pointer and dereference that pointer: get the value the pointer is pointing to.

Similarly, the 'type identifiers' are like casting the pointer to a different type (shown here in MASM/TASM syntax; NASM drops the ptr keyword and writes dword [variable]):

ASM:
    dword ptr [variable]
C:
    *((uint32_t*) variable)

ASM:
    word ptr [variable]
C:
    *((uint16_t*) variable)

I hope this helps you understand the meaning of these expressions.


(this section refers to an addendum that has since been deleted from the original question)

I'm not entirely sure what problem you're experiencing with 'conversion to ASCII', but I suspect you're just confused by how the value is visually rendered in the output.

For example if you have code like this:

myInteger db 41
mov AL, byte ptr [myInteger]

the mov will copy the value 41 from memory into the AL register. The number 41 happens to be the ASCII code of the ) character, but this doesn't change anything. Whether the value is interpreted as an ASCII character or as an integer is up to you, because they are the same value.
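
The same point in C, as a small sketch: one byte, formatted once as a number and once as a character. It assumes an ASCII execution character set, and describe_byte is a made-up helper name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for `myInteger db 41` */
const uint8_t myInteger = 41;

/* Formats the same byte twice: as a decimal number and as a character.
 * Only the format specifier differs; the stored value is identical. */
const char *describe_byte(void) {
    static char text[16];
    int n = snprintf(text, sizeof text, "%d '%c'", myInteger, myInteger);
    assert(n > 0 && (size_t)n < sizeof text); /* output fits the buffer */
    return text;
}
```

With ASCII, the returned text reads 41 ')': the byte itself never changed, only how it was displayed.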

Cauterite answered Oct 05 '22 23:10