
Is the Microsoft Stack always aligned to 16-bytes?

In Assembly Language for x86 Processors, Seventh Edition, by Kip Irvine, page 211, under 5.53 The x86 Calling Convention (which covers the Microsoft x64 calling convention), it says:

  1. When calling a subroutine, the stack pointer (RSP) must be aligned on a 16-byte boundary (a multiple of 16). The CALL instruction pushes an 8-byte return address on the stack, so the calling program must subtract 8 from the stack pointer, in addition to the 32 it already subtracts for the shadow space.

It goes on to show some assembly with a sub rsp, 8 right before the sub rsp, 20h (for the 32 bytes of shadow space).

Is this a safe convention, though? Is the Microsoft stack guaranteed to be aligned to 16 bytes before the CALL instruction? Or is the book wrong in assuming that the stack

  1. was aligned to 16 bytes prior to the CALL,
  2. had an 8-byte return address pushed onto it by the CALL, and
  3. requires an additional sub rsp, 8 to get back to 16-byte alignment?
asked Oct 02 '18 19:10 by NO WAR WITH RUSSIA




1 Answer

"I'm asking about meeting the requirements of the x64 ABI. Is it safe to blindly adjust the stack by growing it 8 bytes for a 16-byte alignment after every call?"

Yes, that's the whole point of the ABI requiring / guaranteeing 16-byte alignment before a call.
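As a minimal sketch of the arithmetic (NASM syntax, Windows x64 assumed; `callee` and `caller_example` are hypothetical names, not from the book): the caller of this function had RSP aligned to 16 at its CALL, so on entry RSP is 8 past a 16-byte boundary, and subtracting 8 + 32 realigns it.

```nasm
; Assemble with: nasm -f win64
; 'callee' is a hypothetical external function used only for illustration.
extern callee
global caller_example

section .text
caller_example:
    ; On entry: RSP % 16 == 8 (aligned before our caller's CALL,
    ; then CALL pushed an 8-byte return address).
    sub  rsp, 8         ; RSP % 16 == 0 again
    sub  rsp, 20h       ; 32 bytes of shadow space, alignment unchanged
    call callee         ; RSP is 16-byte aligned at this CALL
    add  rsp, 28h       ; undo both subtractions (8 + 32 = 40 = 28h)
    ret
```

The two subtractions can equally be written as a single sub rsp, 28h; they are kept separate here only to mirror the book's presentation.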


You can do whatever you want inside a function, for example 3x 16-bit pushes and then sub rsp, (24 - 3*2) to regain 16-byte stack alignment after entry to a function.
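A sketch of that 16-bit push case, under the same assumptions (NASM syntax, Windows x64, hypothetical names `callee` and `push16_example`). The only thing that matters is that the total adjustment since entry is 8 modulo 16 by the time of the next CALL.

```nasm
; Assemble with: nasm -f win64
; 'callee' is a hypothetical external function used only for illustration.
extern callee
global push16_example

section .text
push16_example:
    ; On entry: RSP % 16 == 8
    push ax                     ; 16-bit push, RSP -= 2
    push cx                     ; RSP -= 2
    push dx                     ; RSP -= 2 (6 bytes so far, RSP misaligned)
    sub  rsp, 24 - 3*2          ; 18 more bytes: 24 total, 8 + 24 = 32
    sub  rsp, 20h               ; shadow space; RSP % 16 == 0
    call callee
    add  rsp, 20h + 24          ; discard shadow space, padding and the 3 words
    ret
```

Between the first push and the final sub, RSP is not 16-byte aligned, which is fine because no CALL happens in that window.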

Or movq xmm0, rsp and then use rsp as an extra scratch register to get 16 total integer regs, until you restore it before making another call or ret (see footnote 1).
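A sketch of the scratch-register trick, assuming NASM syntax and the Windows x64 argument registers (RCX, RDX); `add_with_rsp_scratch` is a hypothetical leaf function, and the caveats in the footnote apply to it in full.

```nasm
; Assemble with: nasm -f win64
; Hypothetical leaf function: returns RCX + RDX, using RSP as a scratch register.
global add_with_rsp_scratch

section .text
add_with_rsp_scratch:
    movq xmm0, rsp      ; stash the real stack pointer (XMM0 is volatile)
    mov  rsp, rcx       ; RSP now holds plain data, not a stack address
    add  rsp, rdx       ; no push, pop or call is allowed in this region
    mov  rax, rsp       ; return value
    movq rsp, xmm0      ; restore the real stack pointer before ret
    ret
```

This is a hack for illustration, not something the ABI sanctions: anything that touches the stack while RSP holds data will fault or corrupt memory.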

There's no requirement that RSP be 16-byte aligned after every instruction, only at function call boundaries. This is why they're called "calling conventions", not "coding standards".

This is a similar concept to rbx being call-preserved. It doesn't matter if you save/restore it on the stack, in xmm0, in static storage, if you negate it and then negate it back again, or if you don't touch it at all. All that matters is that it has the same value when you return to the caller as it did when your function was called.
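To illustrate, here are two hypothetical functions (NASM syntax, Windows x64 assumed; `callee`, `save_on_stack` and `save_in_xmm` are made-up names) that preserve rbx in different ways. From the caller's point of view they are equally correct.

```nasm
; Assemble with: nasm -f win64
; 'callee' is a hypothetical external function used only for illustration.
extern callee
global save_on_stack
global save_in_xmm

section .text
save_on_stack:
    push rbx            ; save on the stack; also realigns RSP (8 + 8 = 16)
    sub  rsp, 20h       ; shadow space
    mov  rbx, rcx       ; keep our argument across the call
    call callee         ; callee must preserve RBX for us
    mov  rax, rbx       ; return our original argument
    add  rsp, 20h
    pop  rbx            ; caller sees its original RBX
    ret

save_in_xmm:
    movq xmm0, rbx      ; save in a volatile XMM register (no calls made here)
    mov  rbx, rcx
    add  rbx, rbx       ; use RBX freely
    mov  rax, rbx       ; return 2 * argument
    movq rbx, xmm0      ; caller sees its original RBX
    ret
```

The second version only works because it makes no calls: xmm0 is call-clobbered, so a callee would be free to overwrite the saved value.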


Footnote 1: Works as long as you don't have any async callbacks / SEH handlers that could possibly run on the user-space stack. This is not really guaranteed to be safe, but may work as a hack.

The question "Is it valid to write below ESP?" is related: as Ped7g points out, if something can asynchronously use space below the stack pointer, it will probably break if RSP isn't pointing to stack memory at all.

I've seen a 32-bit example avisynth video filter (I think) that used this to get 8 tmp regs (when no MMX was available), with big warning comments in the code to debug first before using this trick.

answered Sep 29 '22 08:09 by Peter Cordes
