
Assembly: Y86 Stack and call, pushl/popl and ret instructions

Unless I copied it wrong, the code below was written on the blackboard in class by a student, with help and corrections from the teacher:

int array[100], sum, i;

void ini() {
  for(i = 0; i < 100; i++)
    array[i] = i;
}

int main() {
  ini();

  sum = 0;

  for(i = 0; i < 100; i++)
    sum += array[i];
}

.pos 0
  irmovl Stack, %esp
  irmovl Stack, %ebp

  jmp main

array:
.pos 430

sum: .long 0
i: .long 0

main:
  call ini                     //

  irmovl $0, %eax              // %eax = 0
  irmovl sum, %esi             // %esi = 0xsum
  rmmovl %eax, 0(%esi)         // 0(%esi) = %eax <=> 0(0xsum) = 0 [sum = 0]
  rmmovl %eax, 4(%esi)         // 4(%esi) = %eax <=> 4(0xsum) = 0 [i = 0]

compare:
  irmovl $100, %ebx            // %ebx = 100
  subl %eax, %ebx              // %ebx = %ebx - %eax <=> %ebx = 100 - i
  jle finish                   // Jumps to "finish" if 100 - i <= 0 (ZF=1 or SF!=OF)

  mrmovl 0(%esi), %edx         // %edx = 0(%esi) <=> %edx = 0(0xsum) = sum
  addl %eax, %edx              // %edx = %edx + %eax <=> %edx = sum + i => sum
  rmmovl %edx, 0(%esi)         // 0(%esi) = %edx <=> 0(0xsum) = sum

  irmovl $1, %ecx              // %ecx = 1
  addl %ecx, %eax              // %eax = %eax + %ecx <=> %eax = i + 1 => i
  rmmovl %eax, 4(%esi)         // 4(%esi) = %eax <=> 4(0xsum) = i

  jmp compare                  // Jumps unconditionally to "compare"

finish:
  halt                         // Stops execution at the end of main

ini:
  pushl %ebp                   //
  rrmovl %esp, %ebp            //
  pushl %ebx                   //
  pushl %eax                   //

  irmovl $0, %eax              // %eax = 0
  rmmovl %eax, -8(%ebp)        //

ini_compare:
  irmovl $100, %ecx            // %ecx = 100
  subl %eax, %ecx              // %ecx = %ecx - %eax <=> %ecx = 100 - i
  jle ini_finish               // Jumps to "ini_finish" if 100 - i <= 0 (ZF=1 or SF!=OF)

  rrmovl %eax, %ebx            // %ebx = %eax <=> %ebx = i
  addl %eax, %ebx              // %ebx = %ebx + %eax <=> %ebx = i + i = 2i
  addl %ebx, %ebx              // %ebx = %ebx + %ebx <=> %ebx = 2i + 2i = 4i
  rmmovl %eax, array(%ebx)     // array(%ebx) = %eax <=> array(0x4i) = i

  irmovl $1, %ecx              // %ecx = 1
  addl %ecx, %eax              // %eax = %eax + %ecx <=> %eax = i + 1 => i
  rmmovl %eax, -8(%ebp)        //

  jmp ini_compare              // Jumps unconditionally to "ini_compare"

ini_finish:
  irmovl $4, %ebx              //
  addl %ebx, %esp              //
  popl %ebx                    //
  popl %ebp                    //

  ret                          //

.pos 600
  Stack: .long 0

As you can see, there are comments on nearly all the instructions, and I think I understand most of them. What's confusing me is the call, pushl/popl and ret instructions. I don't quite understand them, and I also don't understand what's happening to the stack and where all the registers are pointing. Basically, I mean the lines whose comments (//) are empty.

It's really important that I understand how all this works. Hopefully some of you can shed some light on all this mess.

Some notes on my comments:

  • 0xsum: This doesn't mean the address is "sum", which would be impossible. It's just a way to show which address I mean without using the exact memory address.
  • [sum = 0]: This means that in our C code, the variable sum will be set as 0 at this point.
  • i + 1 => i: This means that we are incrementing the value of 'i' by one and that in the following line 'i' will actually represent that incremented value.
rfgamaral asked Dec 14 '22 03:12



1 Answer

Let's look at some of the code:

main:
  call ini

This pushes the return address (the address of the instruction right after the call) onto the stack, so that you can later return to this position in the code, and then jumps to the address of the ini label. The 'ret' instruction pops that address off the stack and jumps to it to return from the subroutine.
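To make the call/ret pairing concrete, here is a minimal C sketch of what the two instructions do to the stack pointer and the program counter. The Machine struct and the do_call/do_ret/store32/load32 helpers are made-up names for illustration, not part of any Y86 tool. The 5-byte instruction length matches the Y86 call encoding (1 opcode byte plus a 4-byte destination).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy machine: 1 KiB of memory, a stack pointer and a program counter. */
typedef struct {
    uint8_t  mem[1024];
    uint32_t esp;
    uint32_t pc;
} Machine;

/* Write a 4-byte value at a byte address. */
void store32(Machine *m, uint32_t addr, uint32_t value) {
    assert(addr + 4 <= sizeof m->mem);
    memcpy(&m->mem[addr], &value, 4);
}

/* Read a 4-byte value from a byte address. */
uint32_t load32(const Machine *m, uint32_t addr) {
    uint32_t value;
    assert(addr + 4 <= sizeof m->mem);
    memcpy(&value, &m->mem[addr], 4);
    return value;
}

/* call target: push the address of the NEXT instruction, then jump. */
void do_call(Machine *m, uint32_t target) {
    const uint32_t call_length = 5;          /* size of a Y86 call instruction */
    m->esp -= 4;                             /* make room on the stack */
    store32(m, m->esp, m->pc + call_length); /* save the return address */
    m->pc = target;                          /* continue at the subroutine */
}

/* ret: pop the saved return address and jump back to it. */
void do_ret(Machine *m) {
    m->pc = load32(m, m->esp);
    m->esp += 4;
}
```

Assuming esp starts at 600 and the call sits at address 100, do_call leaves esp at 596 with the value 105 stored there, and do_ret brings esp back to 600 and pc to 105.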

The following is the initialisation sequence of a subroutine. It saves the values of some registers on the stack and sets up a stack frame by copying the stack pointer (esp) to the base pointer register (ebp). If the subroutine has local variables, the stack pointer is decremented to make room for the variables on the stack, and the base pointer is used to access the local variables in the stack frame. In the example the only local variable is the (unused) return value.

The push instruction decrements the stack pointer (esp) by the size of the data being pushed, then stores the value at that address. The pop instruction does the opposite: it first reads the value, then increments the stack pointer. (Note that the stack grows downwards, so the stack pointer address gets lower when the stack grows.)
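The order of those two steps can be sketched in C. This is only an illustration: the Stack struct, stack_init, pushl and popl are hypothetical names, and the starting address of 600 assumes the stack begins at the Stack label from the question.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the stack area: byte address a lives in slot[a / 4]. */
typedef struct {
    uint32_t slot[150];   /* addresses 0..599, one 4-byte word per slot */
    uint32_t esp;         /* byte address of the most recently pushed word */
} Stack;

void stack_init(Stack *s) {
    s->esp = 600;         /* assumed initial value: the Stack label */
}

void pushl(Stack *s, uint32_t value) {
    assert(s->esp >= 4);
    s->esp -= 4;                   /* 1. move the stack pointer down */
    s->slot[s->esp / 4] = value;   /* 2. store the value at the new top */
}

uint32_t popl(Stack *s) {
    assert(s->esp < 600);
    uint32_t value = s->slot[s->esp / 4];  /* 1. read the value at the top */
    s->esp += 4;                           /* 2. move the stack pointer back up */
    return value;
}
```

After a push, esp holds the address of the value just stored, which is why it never points at an empty slot above the data.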

ini:
  pushl %ebp             // save ebp on the stack
  rrmovl %esp, %ebp      // ebp = esp (create stack frame)
  pushl %ebx             // save ebx on the stack
  pushl %eax             // push eax on the stack (only to decrement stack pointer)
  irmovl $0, %eax        // eax = 0
  rmmovl %eax, -8(%ebp)  // store eax at ebp-8 (clear return value)

The code follows a standard pattern, so it looks a bit awkward when there are no local variables, and there is an unused return value. If there are local variables a subtraction would be used to decrement the stack pointer instead of pushing eax.

The following is the exit sequence of a subroutine. It restores the stack to the position before the stack frame was created, then returns to the code that called the subroutine.

ini_finish:
   irmovl $4, %ebx   // ebx = 4
   addl %ebx, %esp   // esp += ebx (remove stack frame)
   popl %ebx         // restore ebx from stack
   popl %ebp         // restore ebp from stack
   ret               // get return address from stack and jump there
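Putting the entry and exit sequences together, this C sketch traces the registers through one call to ini. The Cpu struct and the enter_ini/leave_ini helpers are hypothetical, and the concrete values (stack starting at 600, the return address and the address of ini passed in by the caller) are assumptions chosen for the trace.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file plus stack memory (address a lives in slot[a / 4]). */
typedef struct {
    uint32_t slot[150];               /* addresses 0..599 as 4-byte words */
    uint32_t eax, ebx, ebp, esp, pc;
} Cpu;

void push(Cpu *c, uint32_t value) {
    assert(c->esp >= 4);
    c->esp -= 4;
    c->slot[c->esp / 4] = value;
}

uint32_t pop(Cpu *c) {
    assert(c->esp < 600);
    uint32_t value = c->slot[c->esp / 4];
    c->esp += 4;
    return value;
}

/* "call ini" followed by the entry sequence of ini. */
void enter_ini(Cpu *c, uint32_t return_addr, uint32_t ini_addr) {
    push(c, return_addr);   /* call ini: save the return address */
    c->pc = ini_addr;       /*           and jump to ini */
    push(c, c->ebp);        /* pushl %ebp */
    c->ebp = c->esp;        /* rrmovl %esp, %ebp */
    push(c, c->ebx);        /* pushl %ebx: lands at ebp-4 */
    push(c, c->eax);        /* pushl %eax: reserves the slot at ebp-8 */
}

/* The exit sequence at ini_finish. */
void leave_ini(Cpu *c) {
    c->ebx = 4;             /* irmovl $4, %ebx */
    c->esp += c->ebx;       /* addl %ebx, %esp: drops the ebp-8 slot */
    c->ebx = pop(c);        /* popl %ebx */
    c->ebp = pop(c);        /* popl %ebp */
    c->pc = pop(c);         /* ret */
}
```

With esp and ebp both starting at 600, enter_ini leaves ebp at 592 and esp at 584, so -8(%ebp) is exactly the word esp points to. leave_ini then restores ebx, ebp and esp to their values from before the call.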

In response to your comments:

The ebx register is pushed and popped to preserve its value. The compiler apparently always puts this code there, probably because the register is very commonly used (here ini does overwrite it while computing the array offset). Likewise, a stack frame is always created by copying esp to ebp even if it's not really needed.

The instruction that pushes eax is only there to decrement the stack pointer. It's done that way for small decrements as it's shorter and faster than subtracting from the stack pointer. The space that it reserves is for the return value; again, the compiler apparently always does this even if the return value is not used.

In your diagram the esp register is consistently pointing four bytes too high in memory. Remember that the stack pointer is decremented before the value is stored, so it will point to the value pushed, not to the next free slot. (The memory addresses are also way off: they should be something like 0x600 rather than 0x20, as that's where the Stack label is declared.)

Guffa answered Feb 19 '23 04:02
