Load 8bit uint8_t as uint32_t?

My image processing project works with grayscale images. I am targeting an ARM Cortex-A8 platform and want to make use of NEON.

I have a grayscale image (see the example below), and in my algorithm I only have to add up the columns.

How can I load four 8-bit pixel values in parallel, which are uint8_t, as four uint32_t into one of the 128-bit NEON registers? What intrinsic do I have to use to do this?

I must load them as 32 bits because, if you look carefully, 255 + 255 is already 510, which cannot be held in an 8-bit register.

e.g.

255 255 255 255 ......... (640 pixels)
255 255 255 255
255 255 255 255
255 255 255 255
.
.
.
.
.
(480 pixels) 
HaggarTheHorrible asked Sep 09 '10 09:09


2 Answers

I recommend that you spend a bit of time understanding how SIMD works on ARM. Take a look at:

  1. http://blogs.arm.com/software-enablement/161-coding-for-neon-part-1-load-and-stores/
  2. http://blogs.arm.com/software-enablement/196-coding-for-neon-part-2-dealing-with-leftovers/
  3. http://blogs.arm.com/software-enablement/241-coding-for-neon-part-3-matrix-multiplication/
  4. http://blogs.arm.com/software-enablement/277-coding-for-neon-part-4-shifting-left-and-right/

to get you started. You can then implement your SIMD code using inline assembler or the corresponding ARM intrinsics recommended by domen.

doron answered Oct 03 '22 08:10


It depends on your compiler and its (possibly missing) extensions.

E.g., for GCC, this might be a starting point: http://gcc.gnu.org/onlinedocs/gcc/ARM-NEON-Intrinsics.html
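As a rough sketch of how those intrinsics could be combined for the column sums in the question: as far as I know, ARMv7 NEON has no single intrinsic that loads uint8_t values straight into 32-bit lanes, so one common approach is to load eight bytes with vld1_u8 and widen them in two steps (vmovl_u8 to 16 bits, then vmovl_u16 to 32 bits). The function names add_row_to_column_sums and column_sums are hypothetical, the image is assumed to be stored row by row with no padding, and the scalar loop is there for leftover pixels and for builds without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: add one row of 8-bit pixels into 32-bit column sums. */
static void add_row_to_column_sums(const uint8_t *row, uint32_t *sums,
                                   size_t width)
{
    size_t x = 0;
    assert(row != NULL && sums != NULL);

#if HAVE_NEON
    /* Process 8 pixels per iteration. */
    for (; x + 8 <= width; x += 8) {
        uint8x8_t  p8  = vld1_u8(row + x);               /* 8 x uint8  */
        uint16x8_t p16 = vmovl_u8(p8);                   /* 8 x uint16 */
        uint32x4_t lo  = vmovl_u16(vget_low_u16(p16));   /* pixels 0..3 as uint32 */
        uint32x4_t hi  = vmovl_u16(vget_high_u16(p16));  /* pixels 4..7 as uint32 */

        /* Accumulate into the running 32-bit sums for these 8 columns. */
        vst1q_u32(sums + x,     vaddq_u32(vld1q_u32(sums + x),     lo));
        vst1q_u32(sums + x + 4, vaddq_u32(vld1q_u32(sums + x + 4), hi));
    }
#endif

    /* Scalar tail (and the whole row when NEON is unavailable). */
    for (; x < width; ++x)
        sums[x] += row[x];
}

/* Hypothetical entry point: sums[x] = sum of column x over all rows.
   Assumes img holds width * height pixels, row-major, without padding. */
void column_sums(const uint8_t *img, size_t width, size_t height,
                 uint32_t *sums)
{
    for (size_t x = 0; x < width; ++x)
        sums[x] = 0;
    for (size_t y = 0; y < height; ++y)
        add_row_to_column_sums(img + y * width, sums, width);
}
```

For the 640x480 image in the question this would be called as column_sums(image, 640, 480, sums) with a 640-element uint32_t array. 16-bit accumulators would not be enough here, since 480 * 255 = 122400 exceeds 65535. The widen-then-add pair could probably also be folded into the widening add vaddw_u16, which takes a uint32x4_t accumulator and a uint16x4_t operand.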

domen answered Oct 03 '22 08:10