I'm learning assembly and have reached the point where I have no clue about the difference between [variable] and variable. As the tutorials say, both are pointers, so what is the point of this? And why do I have to use a type identifier before []?
My assembler: NASM, x86_64, running on Linux (Ubuntu).
In x86 Intel syntax, [expression] means the content of memory at address expression.
(Except in MASM, when expression is a numeric literal or equ constant with no registers; then it's still an immediate.)
What expression means without brackets depends on the assembler you are using.
NASM-style (NASM, YASM):
mov eax,variable ; moves address of variable into eax
lea eax,[variable] ; equivalent to the previous one (LEA is exception)
mov eax,[variable] ; loads content of variable into eax
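In C terms, the NASM forms above correspond roughly to taking the address of an object versus reading it. The following is only a sketch of that analogy: the 32-bit global `variable` and both helper names are made up for illustration and do not come from the question.

```c
#include <stdint.h>

/* Plays the role of a data definition such as: variable dd 41 */
static int32_t variable = 41;

/* Like NASM `mov eax, variable` or `lea eax, [variable]`:
   the result is the address of the storage, not what it holds. */
int32_t *address_of_variable(void)
{
    return &variable;
}

/* Like NASM `mov eax, [variable]`:
   the result is the content stored at that address. */
int32_t content_of_variable(void)
{
    return variable;
}
```

Dereferencing the pointer returned by `address_of_variable()` gives the same value as `content_of_variable()`, which is the relationship between the bare symbol and the bracketed form in NASM.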
MASM-style (also TASM and even GCC/GAS .intel_syntax noprefix):
mov eax,variable ; load content of variable (for lazy programmers)
mov eax,OFFSET variable ; address of variable
lea eax,[variable] ; address of variable
mov eax,[variable] ; content of variable
GAS (AT&T syntax): It's not Intel syntax, see the AT&T tag wiki. GAS also uses different directives (like .byte instead of db), even in .intel_syntax mode.
In all cases, variable is an alias for a symbol marking a particular place in memory: the place where the label appeared. So:
variable1 db 41
variable2 dw 41
label1:
produces three symbols in the symbol table: variable1, variable2 and label1.
When you use any of them in code, like mov eax,<symbol>, the assembler has no information about whether the symbol was defined by db, by dw, or as a label, so it will not give you any warning when you do mov [variable1],ebx (overwriting 3 bytes beyond the defined first byte).
It's simply just an address in memory.
(Except in MASM, where the db or dd after a label in a data section does associate a size with that "variable name".)
In most assemblers, a type identifier is only required when the operand size can't be deduced from the instruction operands themselves.
mov [ebx],eax ; obviously 32 bits are stored, because eax is 32b wide
mov [ebx],1 ; ERROR: how "wide" is that immediate value 1?
mov [ebx],WORD 1 ; NASM syntax (16 bit value, storing two bytes)
mov WORD [ebx],1 ; NASM syntax (16 bit value, storing two bytes)
mov WORD PTR [ebx],1 ; MASM/TASM syntax
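To see why the assembler needs a size here, it can help to count how many bytes each store width actually touches. The C sketch below is an illustration only: `bytes_written` and its 8-byte buffer are hypothetical, and memcpy stands in for the store instruction.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill buf with 0xFF, store the value 0 with the
   given width (1, 2 or 4 bytes) at its start, and return how many bytes
   of the buffer changed. */
int bytes_written(unsigned char buf[8], int width)
{
    memset(buf, 0xFF, 8);

    if (width == 1) {
        uint8_t v = 0;              /* like: mov BYTE [ebx], 0  */
        memcpy(buf, &v, sizeof v);
    } else if (width == 2) {
        uint16_t v = 0;             /* like: mov WORD [ebx], 0  */
        memcpy(buf, &v, sizeof v);
    } else if (width == 4) {
        uint32_t v = 0;             /* like: mov DWORD [ebx], 0 */
        memcpy(buf, &v, sizeof v);
    }

    int changed = 0;
    for (int i = 0; i < 8; i++) {
        if (buf[i] != 0xFF)
            changed++;
    }
    return changed;
}
```

The same immediate value modifies one, two or four bytes depending on the width, which is exactly the information a bare `mov [ebx],1` fails to supply.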
A little example using registers and pointers:
mov eax, 10
means: move the value 10 into the register EAX. In this case, EAX is used just to store something. Whatever EAX contained before doesn't matter to the programmer, since it is overwritten anyway.
mov dword [eax], 10
means: move the value 10 into the memory at the address stored in the EAX register (NASM needs a size such as dword here, because neither operand implies one). In this case, the value stored in EAX matters a lot to us, since it's a pointer: we have to go to the EAX register and see what it contains, then use that value as the address to access.
Two steps are then needed when you use a pointer:
Go to EAX and see what value it contains (for example, EAX = 0xBABA).
Go to the address pointed to by EAX (in our case 0xBABA) and write 10 there.
Of course, pointers are not necessarily used with registers; this little example is just to explain how it works.
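The two steps can also be written out in C, where a pointer variable plays the role of EAX. This is a minimal sketch; the function names `load_ten` and `store_ten` are invented for the illustration.

```c
#include <stdint.h>

/* Mirrors `mov eax, 10`: the value lands in a local "register";
   no memory is reached through a pointer. */
int32_t load_ten(void)
{
    int32_t reg = 10;
    return reg;
}

/* Mirrors storing 10 through a register used as a pointer.
   Step 1: p already holds an address (the role EAX plays above).
   Step 2: write 10 to the memory found at that address. */
void store_ten(int32_t *p)
{
    *p = 10;
}
```

Calling `store_ten(&cell)` on some `int32_t cell` changes `cell` itself, whereas `load_ten()` only produces a value.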
Since you already know C++, I'm going to answer by showing you what the C equivalents of these expressions are.
When you write [variable] in assembly, it's equivalent to *variable in C. That is, treat variable as a pointer and dereference that pointer: get the value the pointer is pointing to.
Similarly, the 'type identifiers' are like casting the pointer to a different type:
ASM: dword ptr [variable]
C: *((uint32_t*) variable)
ASM: word ptr [variable]
C: *((uint16_t*) variable)
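As a runnable sketch of the same idea, the helpers below read one address at three different widths. The names `read_byte`, `read_word` and `read_dword` are hypothetical. The sketch uses memcpy instead of the pointer casts shown above, because in portable C a direct cast can run into alignment and aliasing rules that assembly does not have.

```c
#include <stdint.h>
#include <string.h>

/* Roughly: byte ptr [addr] */
uint8_t read_byte(const void *addr)
{
    uint8_t v;
    memcpy(&v, addr, sizeof v);
    return v;
}

/* Roughly: word ptr [addr] */
uint16_t read_word(const void *addr)
{
    uint16_t v;
    memcpy(&v, addr, sizeof v);
    return v;
}

/* Roughly: dword ptr [addr] */
uint32_t read_dword(const void *addr)
{
    uint32_t v;
    memcpy(&v, addr, sizeof v);
    return v;
}
```

The address passed in is the same each time; only the number of bytes read, and therefore the resulting value, differs.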
I hope this helps you understand the meaning of these expressions.
(this section refers to an addendum that has since been deleted from the original question)
I'm not entirely sure what problem you're experiencing with 'conversion to ascii', but I suspect you're just confused by how it's visually rendered in output or something.
For example if you have code like this:
myInteger db 41
mov AL, byte ptr [myInteger]
the mov will copy the value 41 from memory into the AL register. The number 41 happens to be the ASCII code for the ) character, but this doesn't change anything. Whether the value is interpreted as an ASCII character or as an integer is up to you, because they are the same value.