
How does machine code access parameters to a subroutine call?

When running a program you can pass parameters, e.g.

$ myProgram par1 par2 par3

In C you can access these parameters through argv:

int main (int argc, char *argv[]) 
{
     char* aParameter = argv[1];  // Not sure if this is 100% right but you get the idea...
}

How would this translate in assembly / x86 machine code? How would you access the variables given to you? How would the system give you these variables?

I'm very new to assembly; it seems you can only access registers and absolute addresses. I am puzzled about how you could access parameters. Does the system preload the parameters into a special register for you?

Robert asked Dec 05 '11 23:12


2 Answers

Function calls

On 32-bit x86, parameters are usually passed on the stack, a region of memory whose current top is pointed to by esp. The operating system is responsible for reserving some memory for the stack and then setting up esp properly before passing control to your program.

A normal function call could look something like this:

main:
  push 456
  push 123
  call MyFunction
  add esp, 8
  ret

MyFunction:
   ; [esp+0] will hold the return address
   ; [esp+4] will hold the first parameter (123)
   ; [esp+8] will hold the second parameter (456)
   ;
   ; To return from here, we usually execute a 'ret' instruction,
   ; which is actually equivalent to:
   ;
   ; add esp, 4
   ; jmp [esp-4]

   ret

The calling function and the function being called split certain responsibilities between them: how parameters are passed, which registers each side promises to preserve, and who cleans up the stack afterwards. These rules are referred to as calling conventions.

The example above uses the cdecl calling convention, which means that parameters are pushed onto the stack in reverse order, and the calling function is responsible for restoring esp back to where it pointed before those parameters were pushed to the stack. That's what add esp, 8 does.
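To make the offsets in the example concrete, here is a toy model of that call written in C. It is only a sketch, not real machine code: the stack is an ordinary array, esp is an index into it, and the names ToyStack, toy_push, toy_peek, toy_call_my_function and toy_ret_and_cleanup are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_WORDS = 16 };      /* arbitrary size for the toy stack */

typedef struct {
    uint32_t mem[STACK_WORDS];  /* one slot = one 4-byte stack word */
    int esp;                    /* index of the slot esp points at */
} ToyStack;

/* Empty stack: esp sits just past the highest slot. */
void toy_init(ToyStack *s) {
    s->esp = STACK_WORDS;
}

/* push: the stack grows toward lower addresses, so esp moves down first. */
void toy_push(ToyStack *s, uint32_t value) {
    assert(s->esp > 0);
    s->esp -= 1;
    s->mem[s->esp] = value;
}

/* Read the word at [esp + byte_offset]; the offset must be a multiple of 4. */
uint32_t toy_peek(const ToyStack *s, int byte_offset) {
    assert(byte_offset >= 0 && byte_offset % 4 == 0);
    assert(s->esp + byte_offset / 4 < STACK_WORDS);
    return s->mem[s->esp + byte_offset / 4];
}

/* Mimics: push 456 / push 123 / call MyFunction.
   'call' pushes the return address before jumping. */
void toy_call_my_function(ToyStack *s, uint32_t return_address) {
    toy_push(s, 456);
    toy_push(s, 123);
    toy_push(s, return_address);
}

/* Mimics 'ret' in the callee followed by 'add esp, 8' in the caller. */
uint32_t toy_ret_and_cleanup(ToyStack *s) {
    uint32_t return_address = toy_peek(s, 0);
    s->esp += 1;   /* ret pops the return address (4 bytes) */
    s->esp += 2;   /* add esp, 8 discards the two parameters */
    return return_address;
}
```

Right after toy_call_my_function, toy_peek(&s, 4) gives the first parameter and toy_peek(&s, 8) the second, matching the comments in MyFunction. After toy_ret_and_cleanup, esp is back where it started.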

Main function

Typically, you write a main function in assembly and assemble it into an object file. You then pass this object file to a linker to produce an executable.

The linker is responsible for linking in startup code (normally supplied by the C runtime) that sets up the stack properly before control is passed to your main function, so that your function can act as if it were called with two arguments (argc/argv). That is, your main function is not the real entry point; the startup code calls it after it has set up the argc/argv arguments.

Startup code

So what does this "startup code" look like? The toolchain supplies it for us, but it's always interesting to know how stuff works.

This is platform specific, but I'll describe a typical case on Linux. This article, while dated, explains the stack layout on Linux when an i386 program starts. The stack will look like this:

esp+00h: argc
esp+04h: argv[0]
esp+08h: argv[1]
esp+0Ch: argv[2]
...

So the startup code can get the argc/argv values from the stack and then call main(...) with two parameters:

; This is very incomplete startup code, but it illustrates the point

mov eax, [esp]        ; eax = argc
lea edx, [esp+0x04]   ; edx = argv

; push argv, and argc onto the stack (note the reverse order)
push edx
push eax
call main
;
; When main returns, use its return value (eax)
; to set an exit status
;
...
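The same walk over the initial stack can be sketched in C. This is an illustration only: the stack is simulated with an array of pointer-sized words that the caller builds, and the helper names initial_argc and initial_argv are hypothetical. It also assumes the argv pointers are followed by a NULL word, as C guarantees for argv[argc].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 'sp' plays the role of esp at process entry: it points at the word
   holding argc, and the argv pointers follow immediately after it. */

/* Equivalent of: mov eax, [esp] */
int initial_argc(const uintptr_t *sp) {
    return (int)sp[0];
}

/* Equivalent of reading [esp + 4 + 4*i] on i386: the i-th argv pointer.
   i == argc is allowed and yields the terminating NULL. */
const char *initial_argv(const uintptr_t *sp, int i) {
    assert(i >= 0 && i <= initial_argc(sp));
    return (const char *)sp[1 + i];
}
```

For a simulated stack holding 3, then pointers to "myProgram", "par1" and "par2", then 0, initial_argc returns 3 and initial_argv(sp, 1) returns the pointer to "par1".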
Martin answered Sep 27 '22 22:09


The C runtime is doing some work for you here: it fetches the program arguments from the OS and parses them if necessary before invoking your main function. In assembler, you'll have to fetch the command arguments and parse them yourself. How you get the program arguments is OS-specific.
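As a rough idea of what "parsing" can mean: on systems where the process receives the command line as a single string (Windows is one such case), something has to split it into argv entries. The sketch below is a deliberately simplified splitter that only breaks on whitespace; real runtimes also handle quoting and escapes. The function name split_command_line is made up for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Split 'line' in place on whitespace and store pointers to the pieces
   in argv. At most max_args pieces are stored; argv must have room for
   max_args + 1 entries because a terminating NULL is appended.
   Returns the number of arguments found (argc). */
int split_command_line(char *line, char *argv[], int max_args) {
    assert(line != NULL && argv != NULL && max_args >= 0);

    int argc = 0;
    char *p = line;

    while (*p != '\0' && argc < max_args) {
        /* Skip the separators in front of the next token. */
        while (isspace((unsigned char)*p)) {
            p++;
        }
        if (*p == '\0') {
            break;
        }

        argv[argc++] = p;   /* the token starts here */

        /* Advance to the end of the token. */
        while (*p != '\0' && !isspace((unsigned char)*p)) {
            p++;
        }
        if (*p != '\0') {
            *p++ = '\0';    /* cut the token off from the rest */
        }
    }

    argv[argc] = NULL;      /* same convention as argv[argc] in main */
    return argc;
}
```

Given the line "myProgram par1 par2 par3", this fills argv with four pointers and returns 4, which is the shape main expects.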

mdma answered Sep 28 '22 00:09