I'm struggling with GCC inline assembly on PowerPC. The program compiles fine with -g2 -O3, but fails to compile with -g3 -O0. The problem is that I need to observe it under the debugger, so I need symbols without optimizations.
Here is the program:
$ cat test.cxx
#include <altivec.h>
#undef vector
typedef __vector unsigned char uint8x16_p;
uint8x16_p VectorFastLoad8(const void* p)
{
    long offset = 0;
    uint8x16_p res;
    __asm(" lxvd2x %x0, %1, %2 \n\t"
          : "=wa" (res)
          : "g" (p), "g" (offset/4), "Z" (*(const char (*)[16]) p));
    return res;
}
And here's the error. (The error has existed since PowerPC vec_xl_be replacement using inline assembly, but I have been able to ignore it until now).
$ g++ -g3 -O0 -mcpu=power8 test.cxx -c
/home/test/tmp/ccWvBTN4.s: Assembler messages:
/home/test/tmp/ccWvBTN4.s:31: Error: operand out of range (64 is not between 0 and 31)
/home/test/tmp/ccWvBTN4.s:31: Error: syntax error; found `(', expected `,'
/home/test/tmp/ccWvBTN4.s:31: Error: junk at end of line: `(31),32(31)'
I believe this is the sore spot from the *.s listing:
#APP
# 12 "test.cxx" 1
lxvd2x 0, 64(31), 32(31)
There are some similar issues reported when using lwz, but I have not found one discussing problems with lxvd2x.
What is the problem and how do I fix it?
Here's the head of the *.s file:
$ head -n 40 test.s
.file "test.cxx"
.abiversion 2
.section ".toc","aw"
.align 3
.section ".text"
.machine power8
.Ltext0:
.align 2
.globl _Z15VectorFastLoad8PKv
.type _Z15VectorFastLoad8PKv, @function
_Z15VectorFastLoad8PKv:
.LFB0:
.file 1 "test.cxx"
.loc 1 7 0
.cfi_startproc
std 31,-8(1)
stdu 1,-96(1)
.cfi_def_cfa_offset 96
.cfi_offset 31, -8
mr 31,1
.cfi_def_cfa_register 31
std 3,64(31)
.LBB2:
.loc 1 8 0
li 9,0
std 9,32(31)
.loc 1 12 0
ld 9,64(31)
#APP
# 12 "test.cxx" 1
lxvd2x 0, 64(31), 32(31)
# 0 "" 2
#NO_APP
xxpermdi 0,0,0,2
li 9,48
stxvd2x 0,31,9
.loc 1 13 0
li 9,48
lxvd2x 0,31,9
Here's the code generated at -O3:
$ g++ -g3 -O3 -mcpu=power8 test.cxx -save-temps -c
$ objdump --disassemble test.o | c++filt
test.o: file format elf64-powerpcle
Disassembly of section .text:
0000000000000000 <VectorFastLoad8(void const*)>:
0: 99 06 43 7c lxvd2x vs34,r3,r0
4: 20 00 80 4e blr
8: 00 00 00 00 .long 0x0
c: 00 09 00 00 .long 0x900
10: 00 00 00 00 .long 0x0
The issue is that the generated asm has register+offset operands for RA and RB, but the lxvd2x instruction only takes direct register addresses (i.e., no offsets).
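To make the operand rule concrete, here is a small C++ model of how an indexed load such as lxvd2x XT, RA, RB forms its effective address. The function name IndexedEA and the gpr array are hypothetical, introduced only for illustration. The rule it models is the indexed addressing rule in the Power ISA: an RA field of 0 contributes zero, otherwise the two registers are added. Both RA and RB are register numbers in the range 0 to 31, which is why the assembler rejects 64(31) with "64 is not between 0 and 31".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model (not real hardware code) of the effective address
// computed by an indexed load like "lxvd2x XT, RA, RB".
// gpr stands in for the 32 general-purpose registers.
std::uint64_t IndexedEA(const std::uint64_t (&gpr)[32], unsigned ra, unsigned rb)
{
    // RA and RB are register numbers, so only 0..31 can be encoded.
    // There is no displacement field, so "64(31)" cannot be expressed.
    assert(ra < 32 && rb < 32);

    // An RA field of 0 means "use the value zero", not the contents of r0.
    const std::uint64_t base = (ra == 0) ? 0 : gpr[ra];
    return base + gpr[rb];
}
```

In this model, the valid "lxvd2x 0, 9, 10" corresponds to IndexedEA(gpr, 9, 10), the sum of r9 and r10.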
It looks like you've got your constraints wrong there. Looking at the inline asm:
__asm(" lxvd2x %x0, %1, %2 \n\t"
: "=wa" (res)
: "g" (p), "g" (offset/4), "Z" (*(const char (*)[16]) p));
Firstly, you have one output operand and three input operands (so four in total), but only three operands used in your template.
I'm assuming that your function reads directly from *p, and it doesn't clobber anything, so it looks like this is an unused operand for indicating a potential memory access (more on that below). We'll keep it simple for now; dropping it gives us:
__asm(" lxvd2x %x0, %1, %2 \n\t"
: "=wa" (res)
: "g" (p), "g" (offset/4));
Compiling that, I still get an offset used for the RA and/or RB:
lxvd2x 0, 40(31), 9
Looking at the docs for the "g" constraint, we see:
'g':
Any register, memory or immediate integer operand is allowed, except for registers that are not general registers.
However, we can't provide a memory operand here; only a register (without offset) is allowed. If we change the constraint to "r":
__asm(" lxvd2x %x0, %1, %2 \n\t"
: "=wa" (res)
: "r" (p), "r" (offset/4));
For me, this compiles to a valid lxvd2x invocation:
lxvd2x 0, 9, 10
- which the assembler happily accepts.
Now, as @PeterCordes has commented, this example no longer indicates that it may access memory, so we should restore that memory input dependency, giving:
__asm(" lxvd2x %x0, %1, %2 \n\t"
: "=wa" (res)
: "r" (p), "r" (offset/4), "m" (*(const char (*)[16]) p));
In effect, all we've done is alter the constraints from "g" to "r", forcing the compiler to use non-offset register operands.
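Putting the pieces together, a complete version of the function might look like the sketch below. Two things in it are my assumptions rather than part of the answer above. First, the base operand uses GCC's PowerPC "b" constraint (a base register, i.e. any general register except r0) instead of "r", because an RA field of 0 in an indexed load means the value zero rather than the contents of r0. Second, the #else branch is a hypothetical fallback (a plain struct filled with memcpy) so that the file also compiles on targets without VSX; it is not part of the original program.

```cpp
#include <cassert>
#include <cstring>

#if defined(__powerpc64__) && defined(__VSX__)
# include <altivec.h>
# undef vector
typedef __vector unsigned char uint8x16_p;
#else
// Hypothetical stand-in type so the sketch builds on non-POWER targets.
struct uint8x16_p { unsigned char b[16]; };
#endif

uint8x16_p VectorFastLoad8(const void* p)
{
    assert(p != nullptr);
    long offset = 0;
    uint8x16_p res;
#if defined(__powerpc64__) && defined(__VSX__)
    // "b": base register, never r0 (assumption: stricter than the "r" used above).
    // "r": any general register for the index operand.
    // "m": unused in the template; tells the compiler the asm reads 16 bytes at p.
    __asm(" lxvd2x %x0, %1, %2 \n\t"
          : "=wa" (res)
          : "b" (p), "r" (offset/4), "m" (*(const char (*)[16]) p));
#else
    // Fallback path: copy the same 16 bytes without inline assembly.
    std::memcpy(&res, static_cast<const char*>(p) + offset/4, sizeof(res));
#endif
    return res;
}
```

On a POWER8 machine this should build with the command from the question, g++ -g3 -O0 -mcpu=power8 test.cxx -c, since neither address operand can now be a stack slot.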