
What is on the stack before my program starts?

Tags:

stack

assembly

I am learning High Level Assembly Language at the moment, and am playing with the stack to better understand everything.

I note that in the following program I can pop the stack 37 times, without ever having pushed anything onto it, before the program crashes.

program test1;
#include("stdlib.hhf")

static
    ike1: uns32 := 1;

begin test1;

    while (ike1 < 38) do
        pop(eax);                        // pop one dword that this program never pushed
        stdout.put(ike1, nl);
        stdout.put("ESP: ", esp, nl);
        stdout.put("EAX: ", eax, nl, nl);
        add(1, ike1);
    endwhile;

end test1;

On every iteration the top of the stack is popped into EAX, and the value printed for EAX looks like random data.

First, I don't understand how this is possible, as I thought every program was segregated into its own private memory space?

In any event, I am popping data off the stack. What would this data be, and would popping it affect any other running programs?

My OS is Windows 7 64 bit.

Jason Sill asked Aug 24 '11 16:08


2 Answers

Before main() executes, the OS has to perform a number of other operations to set up the environment before control is handed to your application. So most of what's on the stack at this point is garbage left over from those earlier operations.

Right before main() is executed, you can expect to find argc and argv on the stack as well.
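As a rough illustration of the argv part of that statement, here is a minimal C sketch. It relies only on the C guarantee that argv[argc] is a null pointer; the helper name count_args is made up for this example and is not part of any library.

```c
#include <stddef.h>

/* Hypothetical helper: counts the entries of a NULL-terminated
   pointer vector such as argv. */
size_t count_args(char *const *vec)
{
    size_t n = 0;
    while (vec[n] != NULL)   /* stop at the terminating null pointer */
        n++;
    return n;
}
```

Called as count_args(argv) from main(int argc, char **argv), the result should equal argc, because the vector that the startup code sets up always ends with a null pointer.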

EDIT:

A comment from a user challenged me to debug an assembly application in gdb and examine the stack to back up a statement I made in the original answer.

So please consider the following assembly code written in nasm:

section .data

  mymsg db "hello, world", 0xa  ; string ending with a newline (line feed)
  mylen equ $-mymsg             ; string length in bytes

section .text
global mystart                ; make the main function externally visible

mystart:
    ; prepare the arguments for syscall write()
    push dword mylen          ; msg length                           
    push dword mymsg          ; msg to write
    push dword 1              ; file descriptor number

    ; call write()
    mov eax, 0x4              ; 0x4 identifies syscall write()
    sub esp, 4                ; OS X (and BSD) syscalls need "extra space" on stack
    int 0x80                  ; trigger the call

    ; clean up the stack
    add esp, 16               ; 3 args * 4 bytes/arg + 4 bytes extra space = 16 bytes

    ; prepare argument for syscall exit()
    push dword 0              ; exit status returned to the operating system

    ; call exit()
    mov eax, 0x1              ; 0x1 identifies syscall exit()
    sub esp, 4                ; OS X (and BSD) system calls need "extra space" on stack
    int 0x80                  ; trigger the call

I assembled and linked this on Mac OS X with:

nasm -f macho -o hello.o hello.nasm
ld -o hello -e mystart hello.o 

As you can probably tell by the source code, the start of the application is defined by mystart, and it doesn't take any parameters.

Now, let's make this investigation a little more exciting by opening this program in gdb:

gdb ./hello

Once gdb has loaded, set a command-line argument for the application, even though it wasn't written to accept any. This is what makes the experiment instructive:

set args deadbeef

The application is still not running at this point. We need to set a breakpoint at the beginning of the entry function so we can inspect the stack and see what's there before our application starts executing its own code:

break mystart

Execute the run command in gdb to start the application and stop at the breakpoint. Now we can inspect the stack with:

x/20xw $esp

outputs:

(gdb) x/20xw $esp
0xbffff8cc: 0x00002000  0x00000000  0x00000002  0xbffff96c
0xbffff8dc: 0xbffff98b  0x00000000  0xbffff994  0xbffff9b0
0xbffff8ec: 0xbffff9c1  0xbffff9d1  0xbffffa0b  0xbffffa40
0xbffff8fc: 0xbffffa5b  0xbffffa86  0xbffffa97  0xbffffaad
0xbffff90c: 0xbffffad8  0xbffffafa  0xbffffb06  0xbffffb28

Yes Sir, this command prints the contents of the stack. It tells gdb to show 20 words (4 bytes each) in hexadecimal format, starting at the address held in the $esp register.
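To make the shape of that listing concrete, here is a small C sketch that formats one row the way the dump above is laid out: the row's starting address, then up to four 32-bit words in hex. The function name format_row and the single-space separator are assumptions made for this example; it only imitates the layout and is not how gdb itself is implemented.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: writes "0xADDR: 0xWORD 0xWORD ..." into out,
   using at most four words. Returns the number of characters written,
   or -1 if the buffer is too small. */
int format_row(char *out, size_t cap, uint32_t addr,
               const uint32_t *words, size_t count)
{
    int used = snprintf(out, cap, "0x%08lx:", (unsigned long)addr);
    for (size_t i = 0; i < count && i < 4; i++) {
        if (used < 0 || (size_t)used >= cap)
            return -1;                      /* previous write did not fit */
        int n = snprintf(out + used, cap - (size_t)used,
                         " 0x%08lx", (unsigned long)words[i]);
        if (n < 0)
            return -1;
        used += n;
    }
    if (used < 0 || (size_t)used >= cap)
        return -1;                          /* last write was truncated */
    return used;
}
```

Feeding it the first four words of the dump together with the address 0xbffff8cc reproduces the first row of the listing, apart from the spacing between columns.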

Let's see: $esp points to 0xbffff8cc, and examining what's stored at that address reveals another address, 0x00002000. What does it point to?

(gdb) x/20sw 0x00002000
0x2000 <mymsg>:  "hello, world\n"

Not a shocker, right?! So let's take a look at what some of the other addresses of the table are pointing to:

(gdb) x/1sw 0xbffff96c
0xbffff96c:  "/Developer/workspace/asm/hello"

Wow. That's actually the original application's name and path stored right there on the stack! Awesome, let's continue to the next interesting address of the table:

(gdb) x/1sw 0xbffff98b
0xbffff98b:  "deadbeef"

Jackpot! The command-line argument we passed when launching the application is also stored on the stack. So, as I stated before, among the garbage on the stack before your application runs you can also find the command-line arguments it was launched with, even when the application's entry function doesn't take any parameters.
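The dump reads naturally as the conventional Unix entry layout: an argument count (the value 2, two words past $esp in the listing), then the argv pointers, a zero word, and then the environment pointers. The C sketch below decodes that layout from an array of words. The names entry_view and decode_entry are invented for this example, and treating the dump as following this layout is an interpretation of the output rather than something the debugger reports.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical view of the words left on the stack at process entry:
   [argc][argv[0]] ... [argv[argc-1]][0][envp[0]] ... [0] */
struct entry_view {
    size_t argc;
    const uintptr_t *argv;  /* argc pointers followed by a 0 word */
    const uintptr_t *envp;  /* environment pointers followed by a 0 word */
};

/* argc_word must point at the word holding the argument count. */
struct entry_view decode_entry(const uintptr_t *argc_word)
{
    struct entry_view v;
    v.argc = (size_t)argc_word[0];
    v.argv = argc_word + 1;
    v.envp = v.argv + v.argc + 1;   /* skip the zero word that ends argv */
    return v;
}
```

Under this reading, argv[0] corresponds to 0xbffff96c (the program path) and argv[1] to 0xbffff98b ("deadbeef"), while the pointers after the zero word would be environment strings.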

karlphillip answered Nov 09 '22 17:11


The private memory space allocated to your program is not guaranteed to be cleared (i.e. filled with zeros). Reading memory that your program never initialized generally has undefined results, which would explain the random data you are getting.

WaelJ answered Nov 09 '22 17:11