I have the following doubts:
As we know, the System V x86-64 ABI gives us a fixed-size area (128 bytes) below the stack pointer, the so-called red zone.
So, as a result, we don't need to use, for example, sub rsp, 12. We can just do mov [rsp-12], X and that's all.
But I cannot grasp the idea behind that. Why does it matter? Is it necessary to do sub rsp, 12 when there is no red zone? After all, the stack size is fixed at the beginning, so why is sub rsp, 12 important? I know that it lets us keep track of the top of the stack, but let's ignore that for the moment.
I know that some instructions use the value of rsp (like ret), but let's not worry about that for the moment.
The crux of the problem is: we have no red zone and we've done:
function:
mov [rsp-16], rcx
mov [rsp-32], rcx
mov [rsp-128], rcx
mov [rsp-1024], rcx
ret
Is that any different from this?
function:
sub rsp, 1024
mov [rsp-16], rcx
mov [rsp-32], rcx
mov [rsp-128], rcx
mov [rsp-1024], rcx
add rsp, 1024
ret
The "red zone" is not strictly necessary. In your terms, it could be considered "pointless." Everything that you could do using the red zone, you could also do the traditional way that you did it targeting the IA-32 ABI.
Here's what the AMD64 ABI says about the "red zone":
The 128-byte area beyond the location pointed to by %rsp is considered to be reserved and shall not be modified by signal or interrupt handlers. Therefore, functions may use this area for temporary data that is not needed across function calls. In particular, leaf functions may use this area for their entire stack frame, rather than adjusting the stack pointer in the prologue and epilogue. This area is known as the red zone.
The real purpose of the red zone is as an optimization. Its existence allows code to assume that the 128 bytes below rsp will not be asynchronously clobbered by signals or interrupt handlers, which makes it possible to use it as scratch space. This makes it unnecessary to explicitly create scratch space on the stack by moving the stack pointer in rsp. This is an optimization because the instructions to decrement and restore rsp can now be elided, saving time and space.
So yes, while you could do this with AMD64 (and would need to do it with IA-32):
function:
push rbp ; standard "prologue" to save the
mov rbp, rsp ; original value of rsp
sub rsp, 32 ; reserve scratch area on stack
mov QWORD PTR [rsp], rcx ; copy rcx into our scratch area
mov QWORD PTR [rsp+8], rdx ; copy rdx into our scratch area
; ...do something that clobbers rcx and rdx...
mov rcx, [rsp] ; retrieve original value of rcx from our scratch area
mov rdx, [rsp+8] ; retrieve original value of rdx from our scratch area
add rsp, 32 ; give back the stack space we used as scratch area
pop rbp ; standard "epilogue" to restore the caller's rbp
ret
we don't need to do it in cases where we only need a 128-byte scratch area (or smaller), because then we can use the red zone as our scratch area.
Plus, since we no longer have to decrement the stack pointer, we can use rsp as the base pointer (instead of rbp), making it unnecessary to save and restore rbp (in the prologue and epilogue), and also freeing up rbp for use as another general-purpose register!
(Technically, turning on frame-pointer omission (-fomit-frame-pointer, enabled by default with -O1 since the ABI allows it) would also make it possible for the compiler to elide the prologue and epilogue sections, with the same benefits. However, absent a red zone, the need to adjust the stack pointer to reserve space would not change.)
Note, however, that the ABI only guarantees that asynchronous things like signals and interrupt handlers do not modify the red zone. Calls to other functions may clobber values in the red zone, so it is not particularly useful except in leaf functions (that is, functions that do not call any other functions, as if they were at the "leaf" of a function-call tree).
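As a rough illustration of the leaf-function case, here is a minimal C sketch. The function name sum_scratch and its volatile scratch array are made up for this example; they do not come from the ABI document or from the code above. The function calls nothing and needs only 32 bytes of scratch, so on a System V x86-64 target the compiler is free to place the array in the red zone. GCC and Clang at -O2 will typically do so, addressing it at negative offsets from rsp with no sub/add pair, though the exact output depends on the compiler and its version.

```c
#include <stdint.h>

/* Hypothetical leaf function: it calls no other function and needs
   only 32 bytes of scratch, well under the 128-byte red zone. */
uint64_t sum_scratch(uint64_t a, uint64_t b)
{
    /* volatile forces real loads and stores, so the array is not
       simply optimized away into registers. */
    volatile uint64_t scratch[4];

    scratch[0] = a;
    scratch[1] = b;
    scratch[2] = a ^ b;
    scratch[3] = a & b;

    return scratch[0] + scratch[1] + scratch[2] + scratch[3];
}
```

Compiling this with gcc -O2 -S, and then again with -mno-red-zone added, should show the difference: with the red zone disabled, an explicit stack-pointer adjustment is expected to appear around the stores.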
A final point: the Windows x64 ABI deviates slightly from the AMD64 ABI used on other operating systems. In particular, it has no concept of a "red zone". The area beyond rsp is considered volatile and subject to being overwritten at any time. Instead, it requires that the caller allocate a home address space on the stack, which is then available for the callee's use in the event that it needs to spill any of the register-passed parameters.