Why are there 8 bytes between the end of a buffer and the saved frame pointer?

I am doing a stack-smashing exercise for coursework, and I have already completed the assignment, but there is one aspect that I do not understand.

Here is the target program:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int bar(char *arg, char *out)
{
  strcpy(out, arg);   /* unbounded copy: this is the overflow */
  return 0;
}

void foo(char *argv[])
{
  char buf[256];
  bar(argv[1], buf);
}

int main(int argc, char *argv[])
{
  if (argc != 2)
    {
      fprintf(stderr, "target1: argc != 2\n");
      exit(EXIT_FAILURE);
    }
  foo(argv);
  return 0;
}

Here are the commands used to compile it, on an x86 virtual machine running Ubuntu 12.04, with ASLR disabled.

gcc -ggdb -m32 -g -std=c99 -D_GNU_SOURCE -fno-stack-protector  -m32  target1.c   -o target1
execstack -s target1

When I look at the memory of this program on the stack, I see that buf has the address 0xbffffc40. Moreover, the saved frame pointer is stored at 0xbffffd48, and the return address is stored at 0xbffffd4c.

These specific addresses are not relevant, but I observe that even though buf is only 256 bytes long, the distance from the start of buf to the saved frame pointer is 0xbffffd48 - 0xbffffc40 = 0x108 = 264 bytes. Symbolically, this computation is $fp - buf.
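The same distance can also be measured from inside a program instead of from gdb. The following is only a sketch and is not part of target1.c: the function name buf_to_frame_distance is hypothetical, and it assumes GCC's __builtin_frame_address(0), which on x86 with a frame pointer should give the address where the saved frame pointer is stored. The value it returns depends on the compiler, its flags, and the target, so it will not necessarily be 264.

```c
#include <stdint.h>

/* Hypothetical helper: returns (frame address) - (start of a 256-byte local
   buffer) for its own stack frame. noinline keeps the frame separate. */
__attribute__((noinline))
long buf_to_frame_distance(void)
{
  volatile char buf[256];
  buf[0] = 0;  /* touch buf so it is really allocated on the stack */

  /* GCC builtin: frame address of the current function. */
  intptr_t fp = (intptr_t)__builtin_frame_address(0);

  /* Compare as integers to avoid subtracting unrelated pointers. */
  return (long)(fp - (intptr_t)buf);
}
```

Calling buf_to_frame_distance() from main and printing the result with %ld gives the in-program analogue of $fp - buf.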

Why are there 8 extra bytes between the end of buf and the stored frame pointer on the stack?

Here is some disassembly of the function foo. I have already examined it, but I did not see any obvious use of that memory region, unless it is implicit (i.e., a side effect of some instruction).

   0x080484ab <+0>:     push   %ebp                    
   0x080484ac <+1>:     mov    %esp,%ebp               
   0x080484ae <+3>:     sub    $0x118,%esp             
   0x080484b4 <+9>:     mov    0x8(%ebp),%eax          
   0x080484b7 <+12>:    add    $0x4,%eax               
   0x080484ba <+15>:    mov    (%eax),%eax             
   0x080484bc <+17>:    lea    -0x108(%ebp),%edx       
   0x080484c2 <+23>:    mov    %edx,0x4(%esp)          
   0x080484c6 <+27>:    mov    %eax,(%esp)             
   0x080484c9 <+30>:    call   0x804848c <bar>         
   0x080484ce <+35>:    leave                          
   0x080484cf <+36>:    ret                            
1 Answer

Basile Starynkevitch gets the prize for mentioning alignment.

It turns out that gcc 4.7.2 defaults to aligning the stack frame to a 4-word boundary. On the 32-bit emulated hardware, that is 16 bytes. The saved frame pointer and the saved instruction pointer together take up only 8 bytes, so buf plus those two words comes to 256 + 8 = 264 bytes, which is not a multiple of 16. The compiler therefore put another 8 bytes after the end of buf, bringing the total to 272 = 17 * 16 bytes and aligning the top of the stack frame to a 16-byte boundary.
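The arithmetic behind that padding can be sketched in a few lines of C. This is a simplified model of the explanation above, not gcc's actual frame-layout algorithm (which also accounts for things like the outgoing-argument area); the function name frame_padding and its parameters are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: number of padding bytes needed so that
   locals + padding + saved is a multiple of align.
   locals: bytes of local variables (256 for buf)
   saved:  bytes for saved frame pointer + return address (8 on 32-bit x86)
   align:  frame alignment in bytes (must be nonzero) */
size_t frame_padding(size_t locals, size_t saved, size_t align)
{
  size_t used = locals + saved;

  /* Distance from used up to the next multiple of align (0 if aligned). */
  return (align - used % align) % align;
}
```

Under this model, frame_padding(256, 8, 16) evaluates to 8, matching the observed gap, while frame_padding(256, 8, 8) evaluates to 0, since 264 is already a multiple of 8.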

With the following additional compiler flag, the 8 extra bytes disappear, because the 8 bytes occupied by the saved frame pointer and the saved instruction pointer are already enough to align to a 2-word boundary.

