
Pointer Dereferencing in x86 Assembly Code

I'm reading my textbook and it has code for a swap function:

In C:

int exchange(int *xp, int y) {
 int x = *xp;
 *xp = y;
 return x; 
}

In x86 Assembly with annotations:

// xp is at %ebp + 8, y at %ebp + 12
movl 8(%ebp), %edx      // get xp
movl (%edx), %eax       // get x at xp
movl 12(%ebp), %ecx     // get y
movl %ecx, (%edx)       // store y at xp

So from my understanding, if the int* xp points to an int I at address A, then the first line of assembly code stores A in %edx. The second line then dereferences it and stores I in %eax.

If this is true, I'm wondering why line 1's "8(%ebp)" doesn't dereference the pointer, storing the int I in %edx instead of the address A? Isn't that what parentheses do in assembly?

Or does that mean that when pointers are pushed onto the stack, the address of the pointer is pushed rather than the value it holds, so 8(%ebp) technically holds &xp?

Just wanted to clarify if my understanding was correct.

Chang Liu asked Apr 23 '15 04:04





2 Answers

xp is a pointer. It has a four-byte value. That value is pushed onto the stack by the calling function. The function prologue, which you haven't shown, sets up a base pointer in ebp. The value of xp is stored at offset 8 relative to that base pointer.

So the first line of code dereferences the base pointer, as indicated by the parentheses, with an offset of 8 to retrieve the value of xp (which is an address that points to an int) and puts it into edx.

The second line of code uses the address in edx to retrieve the value of the int, and puts that value into eax. Note that the function return value will be the value in eax.

The third line dereferences the base pointer, with an offset of 12, to get the value of y.

The fourth line uses the address in edx to place y at the location that xp points to.
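To make the four steps concrete, here is a rough C sketch that plays the role of the assembly. It is only a model, not what the compiler generates: the function name exchange_model and the frame array are made up for illustration, and each array slot is assumed to stand in for one 4-byte stack slot, so frame[2] corresponds to 8(%ebp) and frame[3] to 12(%ebp).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the stack frame: frame[2] plays the role of
   8(%ebp) (holds xp) and frame[3] plays the role of 12(%ebp) (holds y).
   The local variables are named after the registers they imitate. */
int exchange_model(const uintptr_t *frame) {
    assert(frame != NULL);

    uintptr_t edx = frame[2];   /* movl 8(%ebp), %edx  : read the slot, get xp (an address) */
    int eax = *(int *)edx;      /* movl (%edx), %eax   : follow that address, get x */
    int ecx = (int)frame[3];    /* movl 12(%ebp), %ecx : read the slot, get y (a plain int) */
    *(int *)edx = ecx;          /* movl %ecx, (%edx)   : store y where xp points */

    return eax;                 /* the return value is whatever ended up in eax */
}
```

Calling it with a frame whose slot 2 holds the address of an int containing 7 and whose slot 3 holds 42 returns 7 and leaves 42 in that int. Note that reading frame[2] is already one memory access, which is exactly what the parentheses in 8(%ebp) mean.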

user3386109 answered Oct 13 '22 23:10



%ebp is the stack base pointer, which must be dereferenced before anything on the stack can be accessed. So movl 8(%ebp), %edx fetches the value that sits at offset 8 in the current stack frame.

This value is a pointer, so we have to dereference it in order to access its contents, no matter if for reading or writing.

OTOH, y is an int, so getting it takes just movl 12(%ebp), %ecx, with no further action required.

So movl %ecx, (%edx) is exactly the right thing: put the value which is stored in ecx to the memory pointed to by edx.
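The same two levels can be seen in plain C. This is just a sketch under the assumption that taking &xp inside the function is a fair stand-in for the address %ebp + 8; the function name slot_holds_address and its extra parameters are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if one load from the parameter's own slot yields the address
   that was passed in, and a second load yields the int stored there.
   'passed' and 'stored' are the values the caller expects to see. */
int slot_holds_address(int *xp, int *passed, int stored) {
    assert(xp != NULL);

    int **slot = &xp;               /* &xp: where the parameter lives, like the address %ebp + 8 */
    int *first_load = *slot;        /* like movl 8(%ebp), %edx : yields xp, the address A */
    int second_load = *first_load;  /* like movl (%edx), %eax  : yields the int stored at A */

    return first_load == passed && second_load == stored;
}
```

So the address %ebp + 8 corresponds to &xp, while the contents at that address are xp itself. The parentheses do perform a dereference, but it is a dereference of the stack slot, not of xp.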

glglgl answered Oct 13 '22 22:10
