
Why does GCC generate a strange way to move the stack pointer?

I have observed that GCC's C++ compiler generates the following assembler code:

sub    $0xffffffffffffff80,%rsp

This is equivalent to

add    $0x80,%rsp

i.e. remove 128 bytes from the stack.

Why does GCC generate the first sub variant and not the add variant? The add variant seems much more natural to me than exploiting an underflow.

This occurred only once in a quite large code base. I have no minimal C++ example that triggers it. I am using GCC 7.5.0.

Heygard Flisch asked Dec 22 '21 17:12



1 Answer

Try assembling both and you'll see why.

   0:   48 83 ec 80             sub    $0xffffffffffffff80,%rsp
   4:   48 81 c4 80 00 00 00    add    $0x80,%rsp

The sub version is three bytes shorter.

This is because the add and sub immediate instructions on x86 have two forms that matter here. One takes an 8-bit sign-extended immediate, and the other a 32-bit sign-extended immediate. See https://www.felixcloutier.com/x86/add; the relevant forms are (in Intel syntax) add r/m64, imm8 and add r/m64, imm32. The 32-bit one is three bytes larger because its immediate field is four bytes instead of one.
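As a rough illustration of how an assembler might pick between the two forms, here is a minimal Python sketch that encodes add/sub with an immediate on %rsp only. The function names are made up for this example; the opcode bytes (REX.W prefix 48, opcode 83 for imm8 and 81 for imm32, opcode extensions /0 for add and /5 for sub) follow the instruction reference linked above. It is not how GCC or gas are actually implemented.

```python
import struct

RSP = 4                            # register number of %rsp in the ModRM r/m field
OPCODE_EXT = {"add": 0, "sub": 5}  # the /0 and /5 opcode extensions

def fits_imm8(value):
    """True if value is representable as a signed 8-bit immediate."""
    return -128 <= value <= 127

def encode_rsp_imm(op, value):
    """Encode `op $value, %rsp` (64-bit), preferring the short imm8 form."""
    rex_w = 0x48
    # mod = 11 (register direct), reg = opcode extension, r/m = rsp
    modrm = 0xC0 | (OPCODE_EXT[op] << 3) | RSP
    if fits_imm8(value):
        return bytes([rex_w, 0x83, modrm]) + struct.pack("<b", value)
    return bytes([rex_w, 0x81, modrm]) + struct.pack("<i", value)

print(encode_rsp_imm("sub", -0x80).hex(" "))  # 48 83 ec 80
print(encode_rsp_imm("add", 0x80).hex(" "))   # 48 81 c4 80 00 00 00
```

The two printed byte sequences match the disassembly listing above: 4 bytes for the sub form versus 7 for the add form.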

The number 0x80 can't be represented as an 8-bit signed immediate; since the high bit is set, it would sign-extend to 0xffffffffffffff80 instead of the desired 0x0000000000000080. So add $0x80, %rsp would have to use the 32-bit form add r/m64, imm32. On the other hand, 0xffffffffffffff80 would be just what we want if we subtract instead of adding, and so we can use sub r/m64, imm8, giving the same effect with smaller code.
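The sign-extension argument can be checked with a few lines of Python, modelling 64-bit registers with a mask. The helper name and the stack pointer value are hypothetical and chosen only for illustration.

```python
MASK64 = (1 << 64) - 1

def sign_extend(value, bits):
    """Treat the low `bits` bits of value as two's complement and
    return the result as an unsigned 64-bit number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):   # high bit set: the value is negative
        value -= 1 << bits
    return value & MASK64

imm = sign_extend(0x80, 8)
print(hex(imm))  # 0xffffffffffffff80

rsp = 0x7FFFFFFFE000  # hypothetical stack pointer value
after_sub = (rsp - imm) & MASK64    # sub $0xffffffffffffff80, %rsp
after_add = (rsp + 0x80) & MASK64   # add $0x80, %rsp
print(after_sub == after_add)  # True
```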

I wouldn't really say it's "exploiting an underflow". I'd just interpret it as sub $-0x80, %rsp. The compiler is just choosing to emit 0xffffffffffffff80 instead of the equivalent -0x80; it doesn't bother to use the more human-readable version.

Note that 0x80 is the only value for which this trick matters at the 8-bit size: it is the unique nonzero 8-bit number that is its own negative mod 2^8. Any smaller positive number can just use add with an imm8, and any larger number needs the 32-bit form anyway. In fact, 0x80 is the only reason that we couldn't just omit sub r/m, imm8 from the instruction set and always use add with negative immediates in its place. A similar trick does come up for a 64-bit add of 0x0000000080000000: sub with an imm32 will do it, but add can't be used at all, as there is no imm64 form; we'd have to load the constant into another register first.
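A brute-force check of that uniqueness claim, as a small Python sketch (the function names are invented for this example): it lists every amount n for which add $n does not fit a signed immediate of the given width but the negated value used by sub does.

```python
def fits(value, bits):
    """True if value is representable as a signed `bits`-bit immediate."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

def sub_only(bits, candidates):
    """Amounts n where `add $n` does not fit in `bits` bits but `sub $-n` does."""
    return [n for n in candidates if not fits(n, bits) and fits(-n, bits)]

print([hex(n) for n in sub_only(8, range(-1000, 1000))])  # ['0x80']

# The 32-bit analogue: +0x80000000 does not fit an imm32, -0x80000000 does.
print(fits(0x80000000, 32), fits(-0x80000000, 32))  # False True
```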

Nate Eldredge answered Oct 19 '22 04:10
