I decided it would be fun to learn x86 assembly during the summer break. So I started with a very simple hello world program, borrowing from the examples that gcc -S could give me. I ended up with this:
HELLO:
    .ascii "Hello, world!\12\0"
    .text
.globl _main
_main:
    pushl %ebp          # 1. puts the base stack address on the stack
    movl %esp, %ebp     # 2. puts the base stack address in the stack address register
    subl $20, %esp      # 3. ???
    pushl $HELLO        # 4. push HELLO's address on the stack
    call _puts          # 5. call puts
    xorl %eax, %eax     # 6. zero %eax, probably not necessary since we didn't do anything with it
    leave               # 7. clean up
    ret                 # 8. return
                        # PROFIT!
It compiles and even works! And I think I understand most of it.
The magic happens at step 3, though. If I remove this line, my program dies between the call to puts and the xor with a misaligned stack error. And if I change $20 to another value, it crashes too. So I came to the conclusion that this value is very important.
The problem is, I don't know what it does or why it's needed.
Can anyone explain it to me? (I'm on Mac OS, in case it matters.)
On x86 OS X, the stack needs to be 16-byte aligned for function calls (see the ABI doc). So, the explanation is:
    push base pointer (#1)             -4
    strange increment (#3)            -20
    push argument (#4)                 -4
    call pushes return address (#5)    -4
    total                             -32
To check, change line #3 from $20 to $4, which also works.
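The tally above can be checked with a few lines of arithmetic. The following is only a rough sketch in Python, not part of the original program: the names stack_bytes_used and keeps_alignment are made up for illustration, and it assumes 4-byte pushes and counts bytes the same way the tally does.

```python
# Sketch of the tally above for 32-bit x86 (assumed: 4-byte pushes,
# 16-byte alignment as stated in the answer). Names are hypothetical.
WORD = 4         # size of one pushl or return address on 32-bit x86
ALIGNMENT = 16   # stack alignment required at function calls


def stack_bytes_used(reserve):
    """Total bytes by which %esp moves, counted as in the tally above."""
    push_ebp = WORD        # step 1: pushl %ebp
    push_arg = WORD        # step 4: pushl $HELLO
    return_address = WORD  # step 5: return address pushed by call
    return push_ebp + reserve + push_arg + return_address


def keeps_alignment(reserve):
    """True if the total is a multiple of 16, so the stack stays aligned."""
    return stack_bytes_used(reserve) % ALIGNMENT == 0


# Reserve sizes (multiples of 4, up to 40) that keep the stack aligned.
working = [n for n in range(0, 41, 4) if keeps_alignment(n)]
print(working)  # [4, 20, 36]
```

Under these assumptions, $4, $20 and $36 all work for step 3, while any other multiple of 4 leaves the total off by 4, 8 or 12 bytes, which matches the crashes described in the question.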
Also, as Ignacio Vazquez-Abrams points out, #6 is not optional. %eax holds the return value of main, and registers contain remnants of previous calculations (here, whatever puts left behind), so it has to be zeroed explicitly.
I recently learned (and am still learning) assembly, too. To save you the shock, the 64-bit calling conventions are MUCH different (parameters are passed in registers). I found this very helpful for 64-bit assembly.