Given the following minimal test case:
void exit(int);
int main() {
exit(0);
}
GCC 4.9 and later with 32-bit x86 target produces something like:
main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ecx
subl $4, %esp
subl $12, %esp
pushl $0
call exit
Note the convoluted stack-realignment code. With the function renamed to anything but main, however, it gives the (much more reasonable):
xmain:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
subl $12, %esp
pushl $0
call exit
The differences are even more pronounced with -O
. As main
nothing changes; renamed, it yields:
xmain:
subl $24, %esp
pushl $0
call exit
The above was noticed in answering this question:
How do i get rid of call __x86.get_pc_thunk.ax
Is this behavior (and its motivation) documented anywhere, and is there any way to suppress it? GCC has x86 target-specific options to set the preferred/assumed incoming and outgoing stack alignment and enable/disable realignment for arbitrary functions, but they don't seem to be honored for main
.
This answer is based on source diving. I do not know what the developers' intentions or motivations were. All of the code involved seems to date to 2008ish, which is after my own time working on GCC, but long enough ago that people's memories have probably gotten fuzzy. (GCC 4.9 was released in 2014; did you go back any farther than that? If I'm right about when this code was introduced, the clumsy stack alignment for main
should start happening in version 4.4.)
GCC's x86 back end appears to have been coded to make extra-conservative assumptions about the stack alignment on entry to main
, regardless of command-line options. The function ix86_minimum_incoming_stack_boundary
is called to compute the expected stack alignment on entry for each function, and the last thing it does ...
12523 /* Stack at entrance of main is aligned by runtime. We use the
12524 smallest incoming stack boundary. */
12525 if (incoming_stack_boundary > MAIN_STACK_BOUNDARY
12526 && DECL_NAME (current_function_decl)
12527 && MAIN_NAME_P (DECL_NAME (current_function_decl))
12528 && DECL_FILE_SCOPE_P (current_function_decl))
12529 incoming_stack_boundary = MAIN_STACK_BOUNDARY;
12530
12531 return incoming_stack_boundary;
... is override the expected stack alignment to a conservative constant, MAIN_STACK_BOUNDARY
, if the function being compiled is main
. MAIN_STACK_BOUNDARY
is 128 (bits) when compiling 64-bit code and 32 when compiling 32-bit code. As far as I can tell, there is no command-line knob that will make it expect the stack to be more aligned than that on entry to main
. I can persuade it to skip stack alignment for main
by telling it that no additional alignment is needed, compiling your test program with -m32 -mpreferred-stack-boundary=2
gives me
main:
pushl $0
call exit
with GCC 7.3.
The write-only manipulations of %ecx
appear to be a missed-optimization bug. They are coming from this part of ix86_expand_prologue
:
13695 /* Grab the argument pointer. */
13696 t = plus_constant (Pmode, stack_pointer_rtx, m->fs.sp_offset);
13697 insn = emit_insn (gen_rtx_SET (crtl->drap_reg, t));
13698 RTX_FRAME_RELATED_P (insn) = 1;
13699 m->fs.cfa_reg = crtl->drap_reg;
13700 m->fs.cfa_offset = 0;
13701
13702 /* Align the stack. */
13703 insn = emit_insn (ix86_gen_andsp (stack_pointer_rtx,
13704 stack_pointer_rtx,
13705 GEN_INT (-align_bytes)));
13706 RTX_FRAME_RELATED_P (insn) = 1;
13707
The intention is to save a pointer to the incoming argument area before realigning the stack, so that it is straightforward to access arguments. Either because this happens fairly late in the pipeline (after register allocation), or because the instructions are marked FRAME_RELATED, nothing manages to delete those instructions again when they turn out to be unnecessary.
I imagine the GCC devs would at least listen to a bug report about this, but they might reasonably consider it low priority, because these are instructions that are executed only once in the lifetime of the whole program, they're only actually dead when main
doesn't use its arguments, and they only happen in the traditional 32-bit ABI, which I have the impression is considered a second-class target nowadays.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With