It's known that some small structs with a trivial copy constructor and a trivial destructor are passed in registers.
Quoting the Procedure Call Standard for the ARM Architecture (AAPCS):
Fundamental types larger than 32 bits may be passed as parameters to, or returned as the result of, function calls. When these types are in core registers the following rules apply: A double-word sized type is passed in two consecutive registers (e.g., r0 and r1, or r2 and r3). The content of the registers is as if the value had been loaded from memory representation with a single LDM instruction.
And indeed, I can easily confirm this with clang. gcc, however, emits a series of memory stores and loads for the same simple snippet:
struct Trivial {
    int i1;
    int i2;
};
int foo(Trivial t)
{
    return t.i1 + t.i2;
}
$ clang++ arm.cpp -O2 -mabi=aapcs -c -S && cat arm.s
    add r0, r0, r1
    bx lr
$ g++ arm.cpp -O2 -mabi=aapcs -c -S && cat arm.s
    sub sp, sp, #8
    add r3, sp, #8
    stmdb r3, {r0, r1}
    ldmia sp, {r0, r3}
    add r0, r0, r3
    add sp, sp, #8
    bx lr
I'm using the gcc and clang supplied by the Arch Linux ARM distribution, running on a Raspberry Pi 2 (gcc 5.2), but I've also reproduced it with gcc-based cross-compilers.
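For anyone who wants to rule out the struct itself as the cause, here is a minimal sketch of a compile-time check for the properties mentioned at the top: trivial copy constructor, trivial destructor, and double-word size. The helper name fits_two_core_registers and the WithDtor struct are hypothetical, added only for illustration, and the 8-byte limit is an assumption that mirrors the two-register case discussed here rather than the full AAPCS rules.

```cpp
#include <type_traits>

struct Trivial {
    int i1;
    int i2;
};

// Hypothetical counter-example: a user-provided destructor makes the
// type non-trivially destructible.
struct WithDtor {
    int i1;
    int i2;
    ~WithDtor() {}
};

// Hypothetical helper, not part of any ABI or library. It only checks the
// preconditions named in the question; it does not inspect what a given
// compiler actually emits.
template <typename T>
constexpr bool fits_two_core_registers()
{
    return std::is_trivially_copy_constructible<T>::value
        && std::is_trivially_destructible<T>::value
        && sizeof(T) <= 8;  // assumption: two 32-bit core registers
}

static_assert(fits_two_core_registers<Trivial>(),
              "Trivial should satisfy the preconditions");
static_assert(!fits_two_core_registers<WithDtor>(),
              "WithDtor has a non-trivial destructor");
```

If both assertions compile, the struct meets the stated preconditions, so any extra stack traffic in the generated code comes from the compiler and not from the type.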
This has been confirmed as a gcc bug here; now we wait for a fix.