
How to get the address of a variable and dereference it in NASM x86 assembly?

In C we use & to get the address of a variable and * to dereference a pointer.


    int variable=10; 
    int *pointer;
    pointer = &variable;

How do I do this in NASM x86 assembly?
I read the NASM manual and found that [ variable_address ] works like dereferencing (I may be wrong).

    section .data
    variable db 'A'
    section .text
    global _start
    _start:
    mov eax , 4
    mov ebx , 1
    mov ecx , [variable]
    mov edx , 8
    int 0x80
    mov eax ,1
    int 0x80



I ran this code and it prints nothing. I can't understand what is wrong with it. I need your help to understand pointers and dereferencing in NASM x86.

Naveen prakash asked Nov 28 '17 14:11




1 Answer

There are no variables in assembly. (*)

    variable db 'A'

does several things. It defines the assembly-time symbol variable, which is like a bookmark into memory: it holds the address of *here* at assembly time. It's the same as putting a label on an otherwise empty line:

    variable:

The db 'A' directive is "define byte". You give it a single byte value, so it emits a single byte into the resulting machine code with the value 0x41 (65 in decimal). That's the value of the capital letter A in ASCII encoding.

Then:

    mov ecx , [variable]

loads 4 bytes from memory starting at the address variable, which means the low 8 bits of ecx will contain the value 65, and the upper 24 bits will contain whatever junk happened to reside in the 3 bytes following the 'A'. (If you used db 'ABCD' instead, ecx would be equal to 0x44434241: the letters 'D' 'C' 'B' 'A', "reversed" in byte order due to the little-endian encoding of dword values on x86.)
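To make the little-endian point concrete, here is a minimal C sketch of what a 4-byte load from the bytes 'A' 'B' 'C' 'D' produces. The function name load_dword_le is made up for this illustration; it assembles the value by hand in little-endian order, so it mirrors what x86 does regardless of the machine it runs on.

```c
#include <stdint.h>

/* Rough analogue of `mov ecx, [variable]` on x86: combine the 4 bytes
 * starting at p into one 32-bit value, lowest address = lowest byte.
 * (Hypothetical helper, not part of any library.) */
uint32_t load_dword_le(const unsigned char *p)
{
    return (uint32_t)p[0]            /* byte at variable+0 -> bits 0..7   */
         | ((uint32_t)p[1] << 8)     /* byte at variable+1 -> bits 8..15  */
         | ((uint32_t)p[2] << 16)    /* byte at variable+2 -> bits 16..23 */
         | ((uint32_t)p[3] << 24);   /* byte at variable+3 -> bits 24..31 */
}
```

Calling load_dword_le((const unsigned char *)"ABCD") gives 0x44434241: the 'A' (0x41) ends up in the low byte, and whatever sits in the next three bytes fills the rest.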

But sys_write expects ecx to hold the address of the memory where the bytes to write are stored, so you need instead:

    mov ecx, variable

In NASM, that loads the address of the data into ecx.

(In MASM/TASM this would instead assemble as mov ecx,[variable], and to get the address you have to use mov ecx, OFFSET variable. If you happen to find a MASM/TASM example, be aware of the syntax difference.)
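In C terms, the difference is between passing &variable and passing variable itself to write(). A small sketch, assuming a POSIX system; write_byte_at is a hypothetical helper name, and the length is 1 because only one byte was defined (the mov edx , 8 in the question asks for 8 bytes starting at that address).

```c
#include <unistd.h>

const char variable = 'A';   /* like: variable db 'A' */

/* Like `mov ecx, variable` + `mov edx, 1` + the write syscall:
 * the buffer argument is the ADDRESS of the byte, not its value.
 * Returns the number of bytes written, or -1 on error. */
ssize_t write_byte_at(int fd, const char *addr)
{
    return write(fd, addr, 1);
}
```

write_byte_at(STDOUT_FILENO, &variable) prints A. Passing the value 65 where the address is expected is what mov ecx , [variable] effectively does, and the kernel rejects that as a bad address, so nothing is printed.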


*) Some more info about "no variables". Keep in mind that in assembly you are at the machine level. At the machine level there is computer memory, which is addressable by bytes (on the x86 platform; some platforms address memory in units of a different size. They are not common, but you may find some in the micro-controller world). So by using a memory address, you can access particular byte(s) in the physical memory chip. Which physical place in the chip is addressed depends on your platform: a modern OS will usually give a user application a virtual address space, translated to physical addresses by the CPU on the fly, transparently, without bothering user code with that translation.

All the advanced logical concepts like "variables", "arrays", "strings", etc. are just a bunch of byte values in memory, and all of that logical meaning is given to the data by the instructions being executed. When you look at the data without the context of the instructions, they are just byte values in memory, nothing more.

So if you are not precise with your code and you access a single-byte "variable" with an instruction that fetches a dword, as you did in your mov ecx,[variable] example, there's nothing wrong with that from the machine's point of view: it will happily fetch 4 bytes of memory into the ecx register. Nor will NASM bother to warn you that you are probably accessing memory out of bounds, beyond your original variable definition. This is sort of stupid behaviour if you think in terms of "variables" and other high-level language concepts. But assembly is not intended for that kind of work. Having full control over the machine is the main purpose of assembly, and if you want to fetch 4 bytes, you can; it's all up to the programmer. It just requires a tremendous amount of precision and attention to detail: staying aware of the layout of your memory structures, and using the correct instructions with the desired memory operand sizes, like movzx ecx,byte [variable] to load only a single byte from memory and zero-extend it into a full 32-bit value in the target ecx register.
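For comparison, this is roughly what movzx ecx,byte [variable] does, written as C. The helper name load_byte_zero_extended is invented for this sketch: it touches exactly one byte and widens it with zeros, so the neighbouring bytes never leak into the result.

```c
#include <stdint.h>

/* Like `movzx ecx, byte [variable]`: read exactly one byte at addr
 * and zero-extend it to 32 bits (upper 24 bits are always 0). */
uint32_t load_byte_zero_extended(const void *addr)
{
    return (uint32_t)*(const uint8_t *)addr;
}
```

With the bytes 'A', 0xFF, 0xFF, 0xFF in memory, this returns 65, whereas a dword load of the same address would return 0xFFFFFF41.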

Ped7g answered Sep 28 '22 19:09