I have mostly been a high-level coder, and architectures are pretty new to me, so I decided to read the tutorial on Assembly here: http://en.wikibooks.org/wiki/X86_Assembly/Print_Version

Far down the tutorial, instructions were given on how to convert the Hello World! program

<pre class="prettyprint"><code>#include <stdio.h>

int main(void) {
    printf("Hello, world!\n");
    return 0;
}
</code></pre>

into equivalent assembly code, and the following was generated:

<pre class="prettyprint"><code>        .text
LC0:
        .ascii "Hello, world!\12\0"
.globl _main
_main:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $8, %esp
        andl    $-16, %esp
        movl    $0, %eax
        movl    %eax, -4(%ebp)
        movl    -4(%ebp), %eax
        call    __alloca
        call    ___main
        movl    $LC0, (%esp)
        call    _printf
        movl    $0, %eax
        leave
        ret
</code></pre>

For one of the lines,

<pre class="prettyprint"><code>andl    $-16, %esp
</code></pre>

the explanation was:

<blockquote> This code "and"s ESP with 0xFFFFFFF0, aligning the stack with the next lowest 16-byte boundary. An examination of Mingw's source code reveals that this may be for SIMD instructions appearing in the "_main" routine, which operate only on aligned addresses. Since our routine doesn't contain SIMD instructions, this line is unnecessary. </blockquote>

I do not understand this point. Can someone explain what it means to align the stack with the next 16-byte boundary and why it is required? And how does the <code>andl</code> achieve this?

Assume the stack looks like this on entry to <code>_main</code> (the address of the stack pointer is just an example):

<pre class="prettyprint"><code>|    existing     |
|  stack content  |
+-----------------+  <--- 0xbfff1230
</code></pre>

Push <code>%ebp</code>, and subtract 8 from <code>%esp</code> to reserve some space for local variables:

<pre class="prettyprint"><code>|    existing     |
|  stack content  |
+-----------------+  <--- 0xbfff1230
|      %ebp       |
+-----------------+  <--- 0xbfff122c
:    reserved     :
:     space       :
+-----------------+  <--- 0xbfff1224
</code></pre>

Now, the <code>andl</code> instruction zeroes the low 4 bits of <code>%esp</code>, which may decrease it; in this particular example, it has the effect of reserving an additional 4 bytes:

<pre class="prettyprint"><code>|    existing     |
|  stack content  |
+-----------------+  <--- 0xbfff1230
|      %ebp       |
+-----------------+  <--- 0xbfff122c
:    reserved     :
:     space       :
+ - - - - - - - - +  <--- 0xbfff1224
:   extra space   :
+-----------------+  <--- 0xbfff1220
</code></pre>

The point of this is that there are some "SIMD" (Single Instruction, Multiple Data) instructions (also known in x86-land as "SSE" for "Streaming SIMD Extensions") which can perform parallel operations on multiple words in memory, but require those multiple words to be a block starting at an address which is a multiple of 16 bytes.

In general, the compiler can't assume that particular offsets from <code>%esp</code> will result in a suitable address (because the state of <code>%esp</code> on entry to the function depends on the calling code). But, by deliberately aligning the stack pointer in this way, the compiler knows that adding any multiple of 16 bytes to the stack pointer will result in a 16-byte aligned address, which is safe for use with these SIMD instructions.

What does it mean to align the stack?


What does stack alignment mean?

IIRC, stack alignment means that variables are placed on the stack at addresses "aligned" to a particular number of bytes. So with 16-bit stack alignment, each variable on the stack starts at an offset that is a multiple of 2 bytes from the current stack pointer within a function.

What is address alignment?

An aligned access is an operation where a word-aligned address is used for a word, dual word, or multiple word access, or where a halfword-aligned address is used for a halfword access. Byte accesses are always aligned.

What is a byte boundary?

Certain SIMD instructions, which perform the same instruction on multiple data, require that the memory address of this data is aligned to a certain byte boundary. This effectively means that the address of the memory your data resides in needs to be divisible by the number of bytes required by the instruction.

