I was writing a vsprintf function to use in my 64-bit OS kernel (written in C), and checked that it works well in Visual Studio and Cygwin gcc. Then I put it into my kernel and ran it... but the kernel doesn't work properly.
I debugged and figured out the problem: vsprintf contains the following assembly instruction:
movdqa xmm0,XMMWORD PTR [rip+0x0]
The real problem is that I NEVER use floating point!
I guess this is a gcc optimization, and that seems to be correct because it works well without optimization.
Is there any solution, e.g. a gcc option that disables optimizations using xmm registers?
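The XMM register move instructions are generated because, in the System V AMD64 ABI, floating point arguments are passed in XMM0–XMM7.
Since the compiler cannot tell whether floating point values are passed just by looking at the variadic function, it has to generate instructions that save those registers into the va_list register save area as well. The caller sets %al to an upper bound on the number of vector registers it used, which is why the prologue shown below tests %al before the movaps stores.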
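To see why the compiler cannot rule out floating point arguments, here is a minimal sketch in which the argument types are only known at run time. The helper `sum_tagged` and its tag string are hypothetical, made up for illustration, and are not part of the question's code.

```c
#include <stdarg.h>

/* Hypothetical helper: sums the variadic arguments described by `tags`,
   where 'i' means the next argument is an int and 'd' means a double.
   Any other tag character is ignored. */
double sum_tagged(const char *tags, ...) {
    va_list ap;
    double total = 0.0;

    va_start(ap, tags);
    for (const char *p = tags; *p != '\0'; ++p) {
        if (*p == 'i') {
            total += va_arg(ap, int);     /* integer argument */
        } else if (*p == 'd') {
            total += va_arg(ap, double);  /* floating point argument */
        }
    }
    va_end(ap);
    return total;
}
```

On System V AMD64, a call such as `sum_tagged("id", 1, 2.5)` passes the int in a general-purpose register and the double in an XMM register, so the function body alone does not tell the compiler which registers need to be saved.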
You could use the -mno-sse flag to disable SSE. For example,
#include <stdarg.h>
#include <stdio.h>

__attribute__((noinline))
void f(const char* x, ...) {
    va_list va;
    va_start(va, x);
    vprintf(x, va);
    va_end(va);
}
Without the -mno-sse flag:
subq $0x000000d8,%rsp
testb %al,%al
movq %rsi,0x28(%rsp)
movq %rdx,0x30(%rsp)
movq %rcx,0x38(%rsp)
movq %r8,0x40(%rsp)
movq %r9,0x48(%rsp)
je 0x100000f1b
movaps %xmm0,0x50(%rsp)
movaps %xmm1,0x60(%rsp)
movaps %xmm2,0x70(%rsp)
movaps %xmm3,0x00000080(%rsp)
movaps %xmm4,0x00000090(%rsp)
movaps %xmm5,0x000000a0(%rsp)
movaps %xmm6,0x000000b0(%rsp)
movaps %xmm7,0x000000c0(%rsp)
0x100000f1b:
leaq 0x000000e0(%rsp),%rax
movl $0x00000008,0x08(%rsp)
movq %rax,0x10(%rsp)
leaq 0x08(%rsp),%rsi
leaq 0x20(%rsp),%rax
movl $0x00000030,0x0c(%rsp)
movq %rax,0x18(%rsp)
callq 0x100000f6a ; symbol stub for: _vprintf
addq $0x000000d8,%rsp
ret
With the -mno-sse flag:
subq $0x58,%rsp
leaq 0x60(%rsp),%rax
movq %rsi,0x28(%rsp)
movq %rax,0x10(%rsp)
leaq 0x08(%rsp),%rsi
leaq 0x20(%rsp),%rax
movq %rdx,0x30(%rsp)
movq %rcx,0x38(%rsp)
movq %r8,0x40(%rsp)
movq %r9,0x48(%rsp)
movl $0x00000008,0x08(%rsp)
movq %rax,0x18(%rsp)
callq 0x100000f6a ; symbol stub for: _vprintf
addq $0x58,%rsp
ret
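If a kernel build depends on -mno-sse, it may help to check that the flag actually reached the compiler. The sketch below relies on GCC's predefined `__SSE__` macro, which is defined when SSE code generation is enabled; the function name `built_with_sse` is hypothetical.

```c
/* Hypothetical helper: reports whether this translation unit was
   compiled with SSE code generation enabled (GCC defines __SSE__). */
int built_with_sse(void) {
#if defined(__SSE__)
    return 1;   /* SSE enabled, e.g. a default x86-64 build */
#else
    return 0;   /* SSE disabled, e.g. built with -mno-sse */
#endif
}
```

In kernel sources, the same `#if defined(__SSE__)` test could guard an `#error` directive instead, so that a misconfigured build fails at compile time rather than at run time.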
You could also use the target attribute to disable SSE just for that function, e.g.
__attribute__((noinline, target("no-sse")))
//                       ^^^^^^^^^^^^^^^^
void f(const char* x, ...) {
    va_list va;
    va_start(va, x);
    vprintf(x, va);
    va_end(va);
}
But be warned that other functions compiled with SSE support won't know that f doesn't use SSE, so calling it with floating point arguments will cause undefined behavior:
int main() {
    f("%g %g", 1.0, 2.0); // 1.0 and 2.0 are stored in XMM0–1
    // So this will print garbage e.g. `0 6.95326e-310`
}
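For a kernel formatter that never handles floating point, an integer-only design avoids the problem altogether when built with -mno-sse. The following is only a rough sketch under that assumption: `tiny_vsnprintf` and `tiny_snprintf` are hypothetical names, and only `%d`, `%s` and `%%` are handled.

```c
#include <stdarg.h>
#include <stddef.h>

/* Store c at buf[*pos] if room remains for it plus the final NUL. */
static void emit(char *buf, size_t size, size_t *pos, char c) {
    if (*pos + 1 < size) {
        buf[(*pos)++] = c;
    }
}

/* Hypothetical integer-only formatter: supports %d, %s and %%.
   Truncates to fit `size` and returns the number of characters stored. */
int tiny_vsnprintf(char *buf, size_t size, const char *fmt, va_list ap) {
    size_t pos = 0;

    if (size == 0) {
        return 0;
    }
    for (; *fmt != '\0'; ++fmt) {
        if (*fmt != '%') {
            emit(buf, size, &pos, *fmt);
            continue;
        }
        ++fmt;
        if (*fmt == '\0') {
            break;                      /* lone '%' at end of format */
        }
        if (*fmt == 's') {
            const char *s = va_arg(ap, const char *);
            while (*s != '\0') {
                emit(buf, size, &pos, *s++);
            }
        } else if (*fmt == 'd') {
            int v = va_arg(ap, int);
            /* Negate in unsigned arithmetic so INT_MIN is handled. */
            unsigned int u = v < 0 ? 0u - (unsigned int)v : (unsigned int)v;
            char tmp[12];
            int n = 0;
            if (v < 0) {
                emit(buf, size, &pos, '-');
            }
            do {                        /* digits, least significant first */
                tmp[n++] = (char)('0' + u % 10);
                u /= 10;
            } while (u != 0);
            while (n > 0) {
                emit(buf, size, &pos, tmp[--n]);
            }
        } else {
            emit(buf, size, &pos, *fmt); /* "%%" or unknown: copy as-is */
        }
    }
    buf[pos] = '\0';
    return (int)pos;
}

int tiny_snprintf(char *buf, size_t size, const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    int n = tiny_vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}
```

Because no `double` is ever read through `va_arg`, callers only pass integer and pointer arguments, and nothing depends on the XMM registers being saved.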