We permute a vector in a few places, and we need an all-zero vector to use with the vec_perm
built-in. We have not been able to find a vec_zero()
or similar, so we would like to know how to obtain one.
The code currently uses two strategies. The first strategy is a vector load:
__attribute__((aligned(16)))
static const uint8_t z[16] =
{ 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0 };
const uint8x16_p8 zero = vec_ld(0, z);
The second strategy is an xor using the mask we intend to use:
__attribute__((aligned(16)))
static const uint8_t m[16] =
{ 15,14,13,12, 11,10,9,8, 7,6,5,4, 3,2,1,0 };
const uint8x16_p8 mask = vec_ld(0, m);
const uint8x16_p8 zero = vec_xor(mask, mask);
We have not run benchmarks yet, so we don't know whether one is better than the other. The first strategy uses a VMX load, which could be expensive. The second strategy avoids the extra load but introduces a data dependency on the mask.
How do we obtain a VSX value of zero?
I'd suggest letting the compiler handle it for you. Just initialise to zero:
const uint8x16_p8 zero = {0};
This will likely compile to an xor.
For example, a simple test:
vector char foo(void)
{
    const vector char zero = {0};
    return zero;
}
On my machine, this compiles to:
0000000000000000 <foo>:
0: d7 14 42 f0 xxlxor vs34,vs34,vs34
4: 20 00 80 4e blr
...
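The suggestion above can be tried out with a small sketch like the one below. It is only an illustration under some assumptions: the uint8x16_p8 typedef is not shown in the question, so it is assumed here to be a 16-lane unsigned char vector; the non-AltiVec fallback uses the GCC/Clang generic vector extension so the same source also builds on other targets; and the function names are hypothetical. On AltiVec targets it also shows vec_splat_u8(0), the splat-immediate built-in, as another way to get zero without a load.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__ALTIVEC__)
# include <altivec.h>
/* Assumed definition; the question does not show its typedef. */
typedef __vector unsigned char uint8x16_p8;
#else
/* Fallback for non-POWER builds: GCC/Clang generic vector extension. */
typedef uint8_t uint8x16_p8 __attribute__((vector_size(16)));
#endif

/* Zero vector via plain initialisation; the compiler picks the instruction. */
uint8x16_p8 zero_by_init(void)
{
    const uint8x16_p8 zero = {0};
    return zero;
}

#if defined(__ALTIVEC__)
/* Zero vector via the AltiVec splat-immediate built-in (no memory load). */
uint8x16_p8 zero_by_splat(void)
{
    return vec_splat_u8(0);
}
#endif

/* Returns 1 if every byte lane of v is zero, 0 otherwise. */
int all_lanes_zero(uint8x16_p8 v)
{
    uint8_t bytes[16];
    memcpy(bytes, &v, sizeof bytes);   /* copy lanes out for inspection */
    for (size_t i = 0; i < sizeof bytes; ++i) {
        if (bytes[i] != 0)
            return 0;
    }
    return 1;
}

/* Small self-check a caller could run once, e.g. at start-up. */
void check_zero_vectors(void)
{
    assert(all_lanes_zero(zero_by_init()));
#if defined(__ALTIVEC__)
    assert(all_lanes_zero(zero_by_splat()));
#endif
}
```

Whether the initialiser or the splat produces better code depends on the compiler and target, so it is worth inspecting the disassembly as in the answer above rather than assuming either one.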