I am doing a stack-smashing exercise for coursework, and I have already completed the assignment, but there is one aspect that I do not understand.
Here is the target program:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int bar(char *arg, char *out)
{
    strcpy(out, arg);
    return 0;
}

void foo(char *argv[])
{
    char buf[256];
    bar(argv[1], buf);
}

int main(int argc, char *argv[])
{
    if (argc != 2)
    {
        fprintf(stderr, "target1: argc != 2\n");
        exit(EXIT_FAILURE);
    }
    foo(argv);
    return 0;
}
Here are the commands used to compile it, on an x86 virtual machine running Ubuntu 12.04, with ASLR disabled.
gcc -ggdb -m32 -g -std=c99 -D_GNU_SOURCE -fno-stack-protector -m32 target1.c -o target1
execstack -s target1
When I look at the memory of this program on the stack, I see that buf has the address 0xbffffc40. Moreover, the saved frame pointer is stored at 0xbffffd48, and the return address is stored at 0xbffffd4c.
These specific addresses are not relevant, but I observe that even though buf only has length 256, the distance is 0xbffffd48 - 0xbffffc40 = 264. Symbolically, this computation is $fp - buf.
Why are there 8 extra bytes between the end of buf and the stored frame pointer on the stack?
Here is some disassembly of the function foo. I have already examined it, but I did not see any obvious usage of that memory region, unless it was implicit (i.e. a side effect of some instruction).
0x080484ab <+0>: push %ebp
0x080484ac <+1>: mov %esp,%ebp
0x080484ae <+3>: sub $0x118,%esp
0x080484b4 <+9>: mov 0x8(%ebp),%eax
0x080484b7 <+12>: add $0x4,%eax
0x080484ba <+15>: mov (%eax),%eax
0x080484bc <+17>: lea -0x108(%ebp),%edx
0x080484c2 <+23>: mov %edx,0x4(%esp)
0x080484c6 <+27>: mov %eax,(%esp)
0x080484c9 <+30>: call 0x804848c <bar>
0x080484ce <+35>: leave
0x080484cf <+36>: ret
A pointer is 4 bytes on a 32-bit machine and 8 bytes on a 64-bit machine, because it has to hold a full address.
A frame pointer (the ebp register on 32-bit Intel x86, rbp on x86-64) contains the base address of the function's frame. The code that accesses local variables within a function is generated in terms of offsets from the frame pointer.
One reason to save EBP on entry to a procedure or function is that, unless the stack gets corrupted, the stack becomes really easy to walk: EBP points to the saved previous value of EBP, which points to the value before that, and so on through every caller's frame.
A stack frame is a memory management technique used in some programming languages for generating and eliminating temporary variables. In other words, it can be considered the collection of all information on the stack pertaining to a subprogram call.
Basile Starynkevitch gets the prize for mentioning alignment.
It turns out that gcc 4.7.2 defaults to aligning the frame boundary to a 4-word boundary. On 32-bit emulated hardware, that is 16 bytes. Since the saved frame pointer and the saved instruction pointer together only take up 8 bytes, the compiler put in another 8 bytes after the end of buf to align the top of the stack frame to a 16-byte boundary.
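With the following additional compiler flag, the 8 bytes disappear. The flag's argument is the exponent of a power of two, so 2 requests a 4-byte (one-word) boundary, and buf plus the 8 bytes of saved frame pointer and return address already meet that without any padding.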
-mpreferred-stack-boundary=2