I've noticed some really weird behavior while playing with libc's system() function on x86-64 Linux: sometimes the call to system() fails with a segmentation fault. Here's what I found after debugging it with gdb.
I've noticed that the segmentation fault is caused by this line:
=> 0x7ffff7a332f6 <do_system+1094>: movaps XMMWORD PTR [rsp+0x40],xmm0
According to the manual, this is the cause of the SIGSEGV:
When the source or destination operand is a memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) is generated.
Looking deeper, I noticed that my rsp value was indeed not 16-byte aligned (that is, its hex representation didn't end with 0). Manually modifying rsp right before the call to system actually makes everything work.
So I've written the following program:
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    register long long int sp asm ("rsp");
    printf("%llx\n", sp);
    if (sp & 0x8) /* == 0x8*/
    {
        printf("running system...\n");
        system("touch hi");
    }
    return 0;
}
Compiled with gcc 7.3.0. And sure enough, when observing the output:
sha@sha-desktop:~/Desktop/tda$ ltrace -f ./o_sample2
[pid 26770] printf("%llx\n", 0x7ffe3eabe6c87ffe3eabe6c8
) = 13
[pid 26770] puts("running system..."running system...
) = 18
[pid 26770] system("touch hi" <no return ...>
[pid 26771] --- SIGSEGV (Segmentation fault) ---
[pid 26771] +++ killed by SIGSEGV +++
[pid 26770] --- SIGCHLD (Child exited) ---
[pid 26770] <... system resumed> ) = 139
[pid 26770] +++ exited (status 0) +++
So with this program, I cannot execute system() whatsoever.
One small thing, and I cannot tell if it's relevant to the problem: almost all of my runs end up with a bad rsp value and a child that is killed by SIGSEGV.
This makes me wonder a few things:
- Why does system mess around with the xmm registers?
- [...] system() function properly?

Thanks in advance
The x86-64 System V ABI guarantees 16-byte stack alignment before a call, so libc system is allowed to take advantage of that for 16-byte aligned loads/stores. If you break the ABI, it's your problem if things crash.
On entry to a function, after a call has pushed a return address, RSP is 8 bytes away from a 16-byte boundary (RSP+8 and RSP-8 are both 16-byte aligned), so one more push will set you up to call another function.
GCC of course normally has no problem doing this, by using either an odd number of pushes or a sub rsp, 16*n + 8 to reserve stack space. Using a register-asm local variable with asm("rsp") doesn't break this, as long as you only read the variable, not assign to it.
You say you're using GCC 7.3. I put your code on the Godbolt compiler explorer and compiled it with -O3, -O2, -O1, and -O0. It follows the ABI at all optimization levels, making a main that starts with sub rsp, 8 and doesn't modify RSP inside the function (except for call), until the end of the function.
So does every other version and optimization level of clang and gcc I checked.
This is gcc7.3 -O3's code-gen: note that it does not do anything to RSP except read it inside the function body, so if main is called with a valid RSP (16-byte aligned - 8), all of main's function calls will also be made with 16-byte aligned RSP. (And it will never find sp & 8 true, so it will never call system in the first place.)
# gcc7.3 -O3
main:
        sub     rsp, 8
        xor     eax, eax
        mov     edi, OFFSET FLAT:.LC0
        mov     rsi, rsp          # read RSP.
        call    printf
        test    spl, 8            # low 8 bits of RSP
        je      .L2
        mov     edi, OFFSET FLAT:.LC1
        call    puts
        mov     edi, OFFSET FLAT:.LC2
        call    system
.L2:
        xor     eax, eax
        add     rsp, 8
        ret
If you're calling main in some non-standard way, you're violating the ABI. And you don't explain it in the question, so this is not an MCVE.
As I explained in Does the C++ standard allow for an uninitialized bool to crash a program?, compilers are allowed to emit code that takes advantage of any guarantees the ABI of the target platform makes. This includes using movaps for 16-byte loads/stores to copy stuff around on the stack, taking advantage of the incoming alignment guarantee.
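As a rough illustration of that reliance, the sketch below copies 16 bytes through a local buffer declared with alignas(16). The function name is hypothetical. On x86-64 a compiler can typically satisfy this alignment request just by picking a suitable offset from RSP, which only works because it trusts the caller to have followed the ABI; whether it then emits movaps for the copy depends on the compiler and flags.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy 16 bytes via a 16-byte-aligned stack buffer. */
void copy_via_aligned_local(const unsigned char src[16], unsigned char dst[16])
{
    /* The compiler places tmp on a 16-byte boundary relative to the
       incoming stack pointer, so it may use aligned 16-byte moves here. */
    alignas(16) unsigned char tmp[16];

    /* Holds whenever the caller respected the ABI's stack alignment. */
    assert(((uintptr_t)tmp & 0xF) == 0);

    memcpy(tmp, src, sizeof tmp);
    memcpy(dst, tmp, sizeof tmp);
}
```

If a function like this were entered with a misaligned RSP, tmp would land at a misaligned address too, and an aligned 16-byte store to it would fault in the same way the movaps in do_system does.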
It's a missed optimization that gcc doesn't optimize away the if() entirely, like clang does.
But clang is really treating it as an uninitialized variable: since the variable is never used in an asm statement, the register-local asm("rsp") is not having any effect for clang, I think. Clang leaves RSI unmodified before the first printf call, so clang's main actually prints argv, never reading RSP at all.
Clang is allowed to do this: the only supported use for register-asm local vars is making "r"(var) extended-asm constraints pick the register you want. (https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html).
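If the goal is just to look at the stack pointer, one way that doesn't depend on how a compiler treats a bare register-asm variable is to copy RSP out with an extended-asm statement. This is only a sketch: the function name is made up, the asm template assumes the default AT&T syntax (not -masm=intel), and the non-x86-64 branch is a fallback that returns the frame address rather than the exact stack pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return the current stack pointer value. */
uint64_t read_stack_pointer(void)
{
#if defined(__x86_64__)
    uint64_t sp;
    /* Copy RSP into whichever general-purpose register the compiler
       picks for the "=r" output operand (AT&T operand order). */
    __asm__ volatile ("mov %%rsp, %0" : "=r"(sp));
#else
    /* Assumption for other targets: the frame address is close enough
       for illustration purposes. */
    uint64_t sp = (uint64_t)(uintptr_t)__builtin_frame_address(0);
#endif
    assert(sp != 0);
    return sp;
}
```

The value is whatever RSP happens to be inside this helper (or inside its caller, if the call is inlined), so it is useful for inspecting alignment in a debugger session or a test, not as a stable address.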
The manual doesn't imply that simply using such a variable other times can be problematic, so I think this code should be safe in general according to the written rules, as well as happening to work in practice.
The manual does say that using a call-clobbered register (like "rcx" on x86) would lead to the variable being clobbered by function calls, so perhaps a variable using rsp would be affected by compiler-generated push/pop?
This is an interesting test-case: see it on the Godbolt link.
// gcc won't compile this: "error: unable to find a register to spill"
// clang simply copies the value back out of RDX before idiv
int sink;
int divide(int a, int b) {
    register long long int dx asm ("rdx") = b;
    asm("" : "+r"(dx)); // actually make the compiler put the value in RDX
    sink = a/b;         // IDIV uses EDX as an input
    return dx;
}
Without the asm("" : "+r"(dx)); statement, gcc compiles it just fine, never putting b into RDX at all.