 

How to add scalar in neon?

Tags: simd, arm, neon

I want to add a scalar to every lane of a NEON vector. Here is what I've tried:

uint32x4_t result, result2, op, one;

// op + 1

result = vaddq_u32(op, 1); // error: 1 is not a vector

one = vdupq_n_u32(1);

result2 = vaddq_u32(op, one); // ok

What is the best way to save memory space when doing this?

San_kim asked Oct 11 '25 20:10


1 Answer

NEON has no vector-by-scalar instructions for general ALU operations. The by-scalar forms exist only for multiplications, and only for element widths of 16 bits or more.

Nor are there instructions for adding or subtracting an immediate value.
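To illustrate the difference, here is a minimal sketch that scales four lanes by a scalar and then adds 1. The helper name `scale_then_increment` and its signature are made up for this example. The multiplication can use the by-scalar intrinsic `vmulq_n_u32`, while the addition still needs the scalar broadcast with `vdupq_n_u32`. The `#else` branch is only a portable fallback so the sketch also builds on non-ARM hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = in[i] * k + 1 for exactly four lanes. */
void scale_then_increment(const uint32_t *in, uint32_t k, uint32_t *out)
{
    assert(in != NULL && out != NULL);

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint32x4_t op = vld1q_u32(in);

    /* Multiplication has a by-scalar form, so no explicit broadcast. */
    uint32x4_t scaled = vmulq_n_u32(op, k);

    /* Addition has none, so the scalar 1 is broadcast to all lanes first. */
    uint32x4_t result = vaddq_u32(scaled, vdupq_n_u32(1));

    vst1q_u32(out, result);
#else
    /* Portable fallback (assumption: only here so the sketch builds off-ARM). */
    for (int i = 0; i < 4; ++i) {
        out[i] = in[i] * k + 1u;
    }
#endif
}
```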

What you already did, broadcasting the scalar with vdupq_n_u32 and then adding the two vectors, is the way it is supposed to be done.

One thing you could try to boost performance is to declare the vector of 1s as a constant outside the loop, in the hope that the compiler is smart enough not to reload the same value on every iteration.
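A minimal sketch of that hoisting, assuming the data lives in a plain `uint32_t` array. The function name `add_one_u32` is hypothetical and not part of the question. The broadcast happens once before the loop, and a scalar tail handles any elements left over when the length is not a multiple of four. Whether the constant actually stays in a register is up to the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: adds 1 to each of the n elements of data, in place. */
void add_one_u32(uint32_t *data, size_t n)
{
    assert(data != NULL || n == 0);
    size_t i = 0;

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Broadcast the scalar once, outside the loop. */
    const uint32x4_t one = vdupq_n_u32(1);

    /* Process four lanes per iteration. */
    for (; i + 4 <= n; i += 4) {
        uint32x4_t op = vld1q_u32(data + i);
        vst1q_u32(data + i, vaddq_u32(op, one));
    }
#endif

    /* Scalar tail; also the whole loop when NEON is unavailable. */
    for (; i < n; ++i) {
        data[i] += 1u;
    }
}
```

The vector of 1s occupies a single 128-bit register for the whole loop, so no extra memory is needed beyond what the broadcast itself costs.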

Unfortunately, the available ARM compilers aren't that reliable when it comes to NEON. Checking the disassembly is pretty much a necessity, which defeats the point of writing in intrinsics in the first place.

Jake 'Alquimista' LEE answered Oct 15 '25 00:10