Suppose I have some inline assembly that needs a particular char value in ah, bh, ch, or dh. How can I tell GCC to put it there? I don't see a relevant constraint to do that, but the GCC manual says "If you must use a specific register, but your Machine Constraints do not provide sufficient control to select the specific register you want, local register variables may provide a solution", so I tried that:
void f(char x) {
    register char y __asm__("ah") = x;
    __asm__ __volatile__(
        "# my value ended up in %0" :: "a"(y)
    );
}
But it didn't work. It put it in al instead:
    movb 4(%esp), %al
    # my value ended up in %al
The x86-specific Q constraint also looks close to what I want, so I tried it in place of a, but it had the same result. I also tried with the more generic r.
Interestingly, when I compile with Clang instead of GCC (whether with a, Q, or r), then I do get the desired result:
    movb 4(%esp), %ah
    # my value ended up in %ah
I also tried with bh, ch, and dh in place of ah, and every combination of them led to analogous results.
I also tried compiling as 64-bit instead of 32-bit. There, GCC still does basically the same wrong thing:
    movl %edi, %eax
    # my value ended up in %al
And Clang utterly failed to compile, with the error "Cannot encode high byte register in REX-prefixed instruction", unless I turned off optimizations (which I opened LLVM bug #45865 about), in which case it did eventually get the value in the right place:
    movb %dil, -1(%rsp)
    movb -1(%rsp), %al
    movb %al, -2(%rsp)
    movb -2(%rsp), %ah
    # my value ended up in %ah
Is this a bug in GCC that I should report, or is this something that's not supposed to work and is only working by chance in Clang? If the latter, is there a way to do what I want, or will I have to settle for moving it there from somewhere else myself from within the assembly?
32-bit Godbolt link. 64-bit Godbolt link.
Apparently, constraints don't allow selecting the high-byte registers (ah, bh, ch, dh) directly, but you can add an h modifier to an operand reference in the assembler template, as in %h1. This is mentioned in the docs on Input Operands. For example,
void f(char x) {
    char a;
    __asm__ __volatile__(
        "mov %0, %h1" :: "X"(x), "a"(a)
    );
}
produces
    f:
    xorl %eax, %eax
    mov 4(%esp), %ah
    ret
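I haven't been able to get rid of the xor that clears eax. My guess is the code generator is interpreting "%h1" as a 32-bit word with 8 bits set, not a character register reference. Another likely factor is that a is passed as an uninitialized input operand, so the compiler has to put some value in eax before the asm statement runs. For example, this: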
char f(char x) {
    char a;
    __asm__ __volatile__(
        "movb %0, %h1" :: "X"(x), "a"(a)
    );
    return a;
}
... compiles to the same code and therefore returns \0, which is not very intuitive.
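A possible variation on the examples above, offered as a sketch rather than a verified fix: make the register a read-write operand with "+Q", so the compiler knows the asm both consumes and modifies it, and use %h0 to name its high byte. The helper names set_high_byte and get_high_byte are hypothetical, and so are the choice of fixed-width types and the portable fallback branch; I have not checked what assembly each compiler emits for this. The input is also constrained to "Q" on the assumption that this keeps it out of registers that would need a REX prefix in 64-bit mode.

```c
#include <stdint.h>

/* Hypothetical helper: returns `word` with bits 8..15 replaced by `x`. */
static inline uint32_t set_high_byte(uint32_t word, uint8_t x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "+Q": read-write operand placed in a, b, c or d.
       "Q" on the input: x also lives in a, b, c or d (as al/bl/cl/dl),
       so no REX prefix is needed next to a high-byte register.
       %h0 prints the high-byte name (ah/bh/ch/dh) of operand 0. */
    __asm__ ("movb %1, %h0" : "+Q"(word) : "Q"(x));
    return word;
#else
    /* Assumed portable fallback so the sketch builds on other targets. */
    return (word & ~UINT32_C(0xFF00)) | ((uint32_t)x << 8);
#endif
}

/* Hypothetical helper: returns bits 8..15 of `word`. */
static inline uint8_t get_high_byte(uint32_t word)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint8_t out;
    /* %h1 reads the high-byte register of the input operand. */
    __asm__ ("movb %h1, %0" : "=Q"(out) : "Q"(word));
    return out;
#else
    return (uint8_t)(word >> 8);
#endif
}
```

Calling set_high_byte(0x12345678, 0xAB) should give 0x1234AB78, and get_high_byte on that result should give 0xAB. The compiler still picks which of the four registers to use; this only controls that the high-byte half of the chosen register is the one being written or read.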