
Why RBP instead of another register as a frame pointer?

I understand the usage of push rbp...pop rbp at the start and end of a function to preserve the rbp value of the calling function, since the rbp register is callee-preserved. And then I understand the 'convention' of using rbp as the current top of the stack frame for the current procedure being executed. But related to this I have two questions:

  1. Is rbp just a convention? Could I just as easily use r11 (or any other register or even 8 bytes on the stack) as the base of the stack frame? Is there anything special about the rbp register, or is it just used as the frame base because of history and convention?
  2. Why is mov %rbp, %rsp used as a 'cleanup' method before leaving a function? For example, often the push/pop instructions will be symmetrical, so is the mov %rbp, %rsp just a shorthand that lets someone 'skip' doing the symmetrical pops/adds and such? What would be an actual case where mov %rbp, %rsp is useful? Almost every time I see it in compiler output (with zero optimizations turned on), it seems either unnecessary or redundant, and I'm having trouble thinking of a scenario where it might actually be useful.
carl.hiass asked Sep 04 '25 03:09



1 Answer

Optimized code doesn't use frame pointers at all, except for stuff like VLAs / alloca (variable-sized movement of RSP), or if you specifically use -fno-omit-frame-pointer (e.g. to make perf record stack sampling more efficient/reliable). Un-optimized code is usually not as interesting to look at; see the Q&A How to remove "noise" from GCC/clang assembly output?
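As a rough illustration of the VLA case, here is a minimal C sketch. The function name sum_squares_vla and the 1024 size limit are made up for this example, and volatile is only there so an optimizer can't delete the array. Because the array size is known only at run time, RSP moves by a variable amount, and GCC and clang will typically keep RBP as a frame pointer for a function like this even at -O2 (worth checking in your own compiler's output).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example function: sums i*i for i in [0, n) through a VLA.
 * The array's size is only known at run time, so the compiler has to
 * adjust RSP by a variable amount and needs some other fixed reference
 * (typically RBP) to find the rest of the frame and to undo the
 * allocation on return. */
long sum_squares_vla(size_t n)
{
    assert(n > 0 && n <= 1024);      /* keep the stack allocation small */

    volatile long scratch[n];        /* variable-sized movement of RSP */

    for (size_t i = 0; i < n; i++)
        scratch[i] = (long)(i * i);

    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += scratch[i];
    return total;
}
```

Compiling this with something like gcc -O2 -S and comparing it against a version that uses a fixed-size array should show the difference in prologue/epilogue.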

  • x86_64 : is stack frame pointer almost useless?
  • Why is it better to use the ebp than the esp register to locate parameters on the stack? (only for code-size)
  • What are the advantages of a frame pointer?

So there are plenty of duplicates for the part about when / why to use a frame pointer at all. The interesting part is whether a register other than RBP could have been chosen.


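The only things special about RBP are that leave can compactly do the equivalent of mov %rbp, %rsp followed by pop %rbp; and that (%rbp) has no displacement-free encoding, so an addressing mode with RBP as the base always needs an explicit disp8 or disp32, even when the offset is 0.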

So if you are going to use a frame pointer at all, you should pick RBP because it's at least as good as any other reg at being a frame pointer, but worse than other regs for some other uses. You never need 0(frame_pointer), only other offsets. (R13 has the same always-needs-a-disp8=0 effect, but then every stack access would always need a REX prefix, like for add -12(%r13), %eax, which doesn't need one with RBP.)

Also, all other "legacy" registers (that you can use without a REX, i.e. not R8-R15) have at least one implicit use in at least one instruction that compilers may actually generate, like cmpxchg16b, cpuid, shl %cl, %reg, rep movsb or whatever, so any other reg would be worse as a frame pointer. You can't do simple naive un-optimized (or toy-compiler) code-gen if you need to shuffle things around to free up RBX for some instruction that needs it for a different purpose. (Stack unwinding on exceptions may also rely on the frame pointer always being in a specific register, if your .cfi_* directives specified that.)

Consistency with previous x86 modes would have been sufficient reason to use RBP, to make it easier for puny human minds to remember, but there are still code-size and other reasons to pick RBP if you're going to use one. (In fact, since (%rsp) addressing modes always need a SIB byte, the instructions to set up a frame pointer can actually pay for themselves over a large function in terms of code size, although not in instructions / uops.)
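The code-size points above (the mandatory displacement for RBP/R13, the REX prefix for R8-R15, and the SIB byte for RSP) can be checked with a small length calculator. This is only a sketch under stated assumptions: mem_insn_len and the BASE_* names are invented for this example, and it only models an instruction of the form op disp(%base), %eax with a one-byte opcode, 32-bit operand size, and no index register.

```c
#include <assert.h>

/* Hardware register numbers as they appear in ModRM/SIB (names are ours). */
enum {
    BASE_RAX = 0, BASE_RCX, BASE_RDX, BASE_RBX,
    BASE_RSP,     BASE_RBP, BASE_RSI, BASE_RDI,
    BASE_R8,      BASE_R9,  BASE_R10, BASE_R11,
    BASE_R12,     BASE_R13, BASE_R14, BASE_R15
};

/* Length in bytes of `op disp(%base), %eax`, assuming a one-byte opcode,
 * 32-bit operand size, a legacy destination register, and no index. */
int mem_insn_len(int base, int disp)
{
    assert(base >= 0 && base <= 15);

    int low3 = base & 7;
    int rex  = (base >= 8);     /* R8-R15 as base need REX.B */
    int sib  = (low3 == 4);     /* RSP/R12: rm=100 means "SIB follows" */
    int disp_bytes;

    if (disp == 0 && low3 != 5)
        disp_bytes = 0;         /* mod=00; not available for RBP/R13 */
    else if (disp >= -128 && disp <= 127)
        disp_bytes = 1;         /* disp8 */
    else
        disp_bytes = 4;         /* disp32 */

    return rex + 1 /* opcode */ + 1 /* ModRM */ + sib + disp_bytes;
}
```

For example, mem_insn_len(BASE_RBP, -12) gives 3 while the same access through R13 or RSP gives 4, which is the one byte per stack access that the answer is talking about.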


Reasons that are no longer relevant:

An RBP base address implies the SS segment, like RSP. That was relevant in 16-bit mode, and theoretically in 32-bit mode (where non-flat memory models were possible), but not in 64-bit mode, where it only affects which exception you get from a non-canonical address. So that part of the reason is basically gone; pretty much nobody cares about #GP vs. #SS there.

enter is too slow to be usable, but leave is still worth using if RSP isn't already pointing at the saved RBP, only costing 1 extra uop vs. manual mov %rbp, %rsp / pop %rbp on Intel CPUs, so that's what GCC does. You claim to have seen useless mov %rbp, %rsp instructions, but that's not what compilers actually do.

Note that mov %rbp, %rsp (3 bytes) is smaller than add $imm8, %rsp (4 bytes), so if you're using a frame pointer, you might as well restore RSP that way if it's not pointing at the saved RBP. (The exception is when you saved other registers right below the saved RBP instead of after a sub $imm, %rsp: then RSP has to point at them before you can pop them, although you can do the restoring with mov loads instead of pop.)
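To put numbers on the epilogue choices, here is a small C sketch that just tabulates hand-assembled encodings and adds up their sizes. The byte values are my own hand encoding (an assumption you can confirm with objdump -d), and the enc_* and epilogue_bytes names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-assembled x86-64 encodings (assumption: verify with objdump -d). */
static const unsigned char enc_leave[]        = { 0xc9 };                   /* leave          */
static const unsigned char enc_mov_rbp_rsp[]  = { 0x48, 0x89, 0xec };       /* mov %rbp, %rsp */
static const unsigned char enc_pop_rbp[]      = { 0x5d };                   /* pop %rbp       */
static const unsigned char enc_add_imm8_rsp[] = { 0x48, 0x83, 0xc4, 0x10 }; /* add $16, %rsp  */

enum epilogue { EPILOGUE_LEAVE, EPILOGUE_MOV_POP, EPILOGUE_ADD_POP };

/* Code bytes spent restoring RSP and RBP just before ret. */
size_t epilogue_bytes(enum epilogue kind)
{
    switch (kind) {
    case EPILOGUE_LEAVE:   return sizeof enc_leave;
    case EPILOGUE_MOV_POP: return sizeof enc_mov_rbp_rsp + sizeof enc_pop_rbp;
    case EPILOGUE_ADD_POP: return sizeof enc_add_imm8_rsp + sizeof enc_pop_rbp;
    }
    assert(!"unknown epilogue kind");
    return 0;
}
```

So leave is 1 byte, mov + pop is 4, and add + pop is 5. Beyond size, the mov/leave forms don't need to know how far RSP has moved, which is what makes them usable after a VLA or alloca.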

Peter Cordes answered Sep 08 '25 04:09
