
Data type compatibility with NEON intrinsics

I am working on ARM optimizations using the NEON intrinsics, from C++ code. I understand and master most of the typing issues, but I am stuck on this one:

The intrinsic vzip_u8 returns a uint8x8x2_t value (in fact a struct wrapping an array of two uint8x8_t). I want to assign the returned value to a plain uint16x8_t. I see no appropriate vreinterpretq intrinsic to achieve that, and simple casts are rejected.

asked Dec 27 '22 13:12 by Yves Daoust


1 Answer

Some definitions to answer clearly...

On ARMv7 (AArch32), NEON has 32 registers, each 64 bits wide, which can also be viewed as 16 registers, each 128 bits wide.

The NEON unit can view the same register bank as:

  • sixteen 128-bit quadword registers, Q0-Q15
  • thirty-two 64-bit doubleword registers, D0-D31.

uint16x8_t is a type that requires 128-bit storage, so it needs to live in a quadword register.

The ARM® C Language Extensions define a "vector array data type" for the NEON intrinsics:

... for use in load and store operations, in table-lookup operations, and as the result type of operations that return a pair of vectors.

vzip instruction

... interleaves the elements of two vectors.

vzip Dd, Dm

and has an intrinsic like

uint8x8x2_t vzip_u8 (uint8x8_t, uint8x8_t) 

From these we can conclude that uint8x8x2_t is actually a list of two arbitrarily numbered doubleword registers, because the vzip instruction places no requirement on the order of its input registers.

Now the answer is...

uint8x8x2_t can hold two non-consecutive doubleword registers, while uint16x8_t occupies two consecutive doubleword registers, the first of which has an even index (D0-D31 -> Q0-Q15).

Because of this, you can't cast a vector array data type made of two doubleword registers to a quadword register... easily.

The compiler may be smart enough to assist you, or you can just force the conversion; however, I would check the resulting assembly for correctness as well as performance.
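One way to force the conversion, as a minimal sketch rather than a definitive recipe: join the two halves with vcombine_u8, then reinterpret the resulting uint8x16_t with vreinterpretq_u16_u8. The helper names zip_to_u16, zip_bytes_to_u16 and zip_self_check are hypothetical, introduced only for illustration. The scalar branch is there only so the snippet also builds on non-ARM hosts, and the lane values in the self-check assume a little-endian target.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

// Hypothetical helper: zip two 8-byte vectors and view the 16 interleaved
// bytes as eight 16-bit lanes.
static inline uint16x8_t zip_to_u16(uint8x8_t a, uint8x8_t b) {
    uint8x8x2_t z = vzip_u8(a, b);                   // two independent D-sized halves
    uint8x16_t  q = vcombine_u8(z.val[0], z.val[1]); // join them into one Q-sized vector
    return vreinterpretq_u16_u8(q);                  // same bits, now typed as 8 x u16
}

// Hypothetical wrapper working on plain arrays.
void zip_bytes_to_u16(const uint8_t a[8], const uint8_t b[8], uint16_t out[8]) {
    vst1q_u16(out, zip_to_u16(vld1_u8(a), vld1_u8(b)));
}
#else
// Scalar fallback (assumption: only used to build on hosts without NEON).
void zip_bytes_to_u16(const uint8_t a[8], const uint8_t b[8], uint16_t out[8]) {
    uint8_t bytes[16];
    for (int i = 0; i < 8; ++i) {
        bytes[2 * i]     = a[i];  // even bytes come from a
        bytes[2 * i + 1] = b[i];  // odd bytes come from b
    }
    std::memcpy(out, bytes, sizeof bytes);
}
#endif

// On a little-endian target, lane i should equal a[i] | (b[i] << 8).
void zip_self_check() {
    const uint8_t a[8] = {0, 1, 2, 3, 4, 5, 6, 7};
    const uint8_t b[8] = {8, 9, 10, 11, 12, 13, 14, 15};
    uint16_t out[8];
    zip_bytes_to_u16(a, b, out);
    for (int i = 0; i < 8; ++i) {
        assert(out[i] == static_cast<uint16_t>(a[i] | (b[i] << 8)));
    }
}
```

Whether vcombine_u8 costs an extra register move depends on how the compiler allocates the two halves, so the advice above still applies: inspect the generated assembly.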

answered Dec 29 '22 02:12 by auselen