I understand the usage of push rbp ... pop rbp at the start and end of a function to preserve the rbp value of the calling function, since the rbp register is callee-preserved. I also understand the 'convention' of using rbp as the current top of the stack frame for the procedure being executed. But related to this I have two questions:
1. Is rbp just a convention? Could I just as easily use r11 (or any other register, or even 8 bytes on the stack) as the base of the stack frame? Is there anything special about the rbp register, or is it used as the frame pointer only because of history and convention?
2. Is mov %rbp, %rsp used as a 'cleanup' method before leaving a function? For example, the push/pop instructions will often be symmetrical, so is mov %rbp, %rsp just a shorthand that lets someone 'skip' doing the symmetrical pops/adds and such? What would be an actual use case where mov %rbp, %rsp
would be useful? Almost every time I see it in compiler output (with zero optimizations turned on), it seems either unnecessary or redundant, and I'm having trouble thinking of a scenario where it might actually be useful.

Optimized code doesn't use frame pointers at all, except for stuff like VLAs / alloca (variable-sized movement of RSP), or if you specifically use -fno-omit-frame-pointer (e.g. to make perf record stack sampling more efficient/reliable). Un-optimized code is usually not as interesting to look at; see How to remove "noise" from GCC/clang assembly output?
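As a minimal sketch of the VLA case: in the C function below, the size of scratch is known only at run time, so the amount by which RSP moves is not a compile-time constant. That is the situation in which GCC and clang typically keep RBP as a frame pointer even with optimization enabled. The name sum_squares_vla is hypothetical (it is not from the question), and the exact code generated depends on the compiler and flags.

```c
#include <stddef.h>

/* Hypothetical helper: sums the squares 0..n-1 via a run-time-sized
 * scratch buffer. The VLA is what makes the stack adjustment variable. */
long sum_squares_vla(size_t n)
{
    /* A zero-length VLA is undefined behaviour, so allocate at least 1. */
    long scratch[n ? n : 1];
    long total = 0;

    for (size_t i = 0; i < n; i++)
        scratch[i] = (long)(i * i);
    for (size_t i = 0; i < n; i++)
        total += scratch[i];

    return total;
}
```

Compiling this with something like gcc -O2 -S and searching the output for rbp is one way to check what your own compiler does here.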
So there are plenty of duplicates for the part about when / why to use a frame pointer at all. The interesting part is whether a register other than RBP could have been chosen.
The only things special about RBP are that leave can compactly do RSP=RBP + pop RBP, and that a (%rbp) addressing mode requires an explicit disp8 or disp32 (with value 0).
So if you are going to use a frame pointer at all, you should pick RBP because it's at least as good as any other reg at being a frame pointer, but worse than other regs for some other uses. You never need 0(frame_pointer), only other offsets. (R13 has the same always-needs-a-disp8=0 effect, but then every stack access would need a REX prefix, like for add -12(%r13), %eax, which doesn't need one with RBP.)
Also, all other "legacy" registers (that you can use without a REX, i.e. not R8-R15) have at least one implicit use in at least one instruction that compilers may actually generate, like cmpxchg16b, cpuid, shl %cl, %reg, rep movsb or whatever, so any other reg would be worse as a frame pointer. You can't do simple naive un-optimized (or toy-compiler) code-gen if you need to shuffle things around to free up RBX for some instruction that needs it for a different purpose. (Stack unwinding on exceptions may also rely on the frame pointer always being in a specific register, if your .cfi_* directives specified that.)
Consistency with previous x86 modes would have been sufficient reason to use RBP, to make it easier for puny human minds to remember, but there are still code-size and other reasons to pick RBP if you're going to use one. (In fact, since (%rsp) addressing modes always need a SIB byte, the instructions to set up a frame pointer can actually pay for themselves over a large function in terms of code size, although not in instructions / uops.)
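The code-size points above can be made concrete by writing the encodings out as a small table. This is only a sketch: the byte values were assembled by hand from the x86-64 encoding rules (ModRM, SIB, REX), so checking them against an assembler plus objdump -d is worthwhile, and the names struct encoding, ADDRESSING_MODES and addressing_mode_length are made up for this illustration.

```c
#include <stddef.h>
#include <string.h>

struct encoding {
    const char *text;        /* instruction in AT&T syntax */
    unsigned char bytes[4];  /* machine code, unused tail is zero */
    size_t length;           /* number of bytes actually used */
};

static const struct encoding ADDRESSING_MODES[] = {
    /* plain base register: ModRM only */
    {"mov (%rax), %eax",    {0x8B, 0x00},             2},
    /* RBP as base: needs an explicit disp8 of 0 */
    {"mov (%rbp), %eax",    {0x8B, 0x45, 0x00},       3},
    /* R13 as base: same disp8 of 0, plus a REX prefix */
    {"mov (%r13), %eax",    {0x41, 0x8B, 0x45, 0x00}, 4},
    /* RSP as base: needs a SIB byte */
    {"mov (%rsp), %eax",    {0x8B, 0x04, 0x24},       3},
    /* typical frame accesses with a non-zero disp8 */
    {"add -12(%rbp), %eax", {0x03, 0x45, 0xF4},       3},
    {"add -12(%r13), %eax", {0x41, 0x03, 0x45, 0xF4}, 4},
    {"add 12(%rsp), %eax",  {0x03, 0x44, 0x24, 0x0C}, 4},
};

/* Returns the encoded length of a listed instruction, or 0 if absent. */
size_t addressing_mode_length(const char *text)
{
    size_t count = sizeof ADDRESSING_MODES / sizeof ADDRESSING_MODES[0];

    for (size_t i = 0; i < count; i++) {
        if (strcmp(ADDRESSING_MODES[i].text, text) == 0)
            return ADDRESSING_MODES[i].length;
    }
    return 0;
}
```

With a non-zero displacement, the RBP-relative access is one byte shorter than both the R13-relative one (REX prefix) and the RSP-relative one (SIB byte), which is where the per-access saving comes from.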
Reasons that are no longer relevant:
An RBP base address implies the SS segment, like RSP, which was relevant in 16-bit mode, and theoretically in 32-bit mode (where non-flat memory models were possible), but not in 64-bit mode, where it only affects which exception you get from a non-canonical address. So that part of the reason is basically gone; pretty much nobody cares about #GP vs. #SS there.
enter is too slow to be usable, but leave is still worth using if RSP isn't already pointing at the saved RBP, only costing 1 extra uop vs. manual mov %rbp, %rsp / pop %rbp on Intel CPUs, so that's what GCC does. You claim to have seen useless mov %rbp, %rsp instructions, but that's not what compilers actually do.
Note that mov %rbp, %rsp (3 bytes) is smaller than add $imm8, %rsp (4 bytes), so if you're using a frame pointer, you might as well restore RSP that way if it's not pointing at the saved RBP. (Unless you need to restore other registers if you saved them right below RBP instead of after a sub $imm, %rsp, although you can do the restoring with mov loads instead of pop.)
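To put numbers on the epilogue choices, here is a small sketch that stores three ways of tearing down a frame as raw bytes and reports their sizes. The encodings were written out by hand, the frame size of 16 in the add variant is an arbitrary assumption, and the identifiers are hypothetical; ret is left out because it is the same in every variant.

```c
#include <stddef.h>

/* leave */
static const unsigned char EPILOGUE_LEAVE[] = {0xC9};

/* mov %rbp, %rsp ; pop %rbp */
static const unsigned char EPILOGUE_MOV_POP[] = {0x48, 0x89, 0xEC, 0x5D};

/* add $16, %rsp ; pop %rbp  (only valid for a fixed 16-byte frame) */
static const unsigned char EPILOGUE_ADD_POP[] = {0x48, 0x83, 0xC4, 0x10, 0x5D};

/* Size in bytes of each epilogue variant. */
size_t epilogue_leave_size(void)   { return sizeof EPILOGUE_LEAVE; }
size_t epilogue_mov_pop_size(void) { return sizeof EPILOGUE_MOV_POP; }
size_t epilogue_add_pop_size(void) { return sizeof EPILOGUE_ADD_POP; }
```

The sizes come out as 1, 4 and 5 bytes respectively, so leave is the most compact, and mov %rbp, %rsp still beats add $imm8, %rsp by one byte when a frame pointer is available.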