Load 8bit uint8_t as uint32_t?

Question

My image processing project works with grayscale images. I have an ARM Cortex-A8 processor platform and want to make use of NEON.

I have a grayscale image (consider the example below), and in my algorithm I only have to add the columns.

How can I load four 8-bit pixel values in parallel, which are uint8_t, as four uint32_t into one of the 128-bit NEON registers? What intrinsic do I have to use to do this?

I must load them as 32 bits because, if you look carefully, 255 + 255 is already 510, which can't be held in an 8-bit register.

For example:

255 255 255 255 ......... (640 pixels wide)
255 255 255 255
255 255 255 255
255 255 255 255
...
(480 pixels high)

doron · Accepted Answer

I recommend spending a bit of time understanding how SIMD works on ARM. Take a look at:

http://blogs.arm.com/software-enablement/161-coding-for-neon-part-1-load-and-stores/
http://blogs.arm.com/software-enablement/196-coding-for-neon-part-2-dealing-with-leftovers/
http://blogs.arm.com/software-enablement/241-coding-for-neon-part-3-matrix-multiplication/
http://blogs.arm.com/software-enablement/277-coding-for-neon-part-4-shifting-left-and-right/

to get you started. You can then implement your SIMD code using inline assembler or the corresponding ARM intrinsics, as recommended by domen.

domen · Answer

It depends on your compiler and its extensions (or possible lack of them).

E.g., for GCC, this might be a starting point: http://gcc.gnu.org/onlinedocs/gcc/ARM-NEON-Intrinsics.html
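To make the intrinsics route concrete, here is a minimal sketch in C. As far as I know there is no single NEON intrinsic that loads four uint8_t values straight into a uint32x4_t, so the sketch loads eight pixels with vld1_u8 and widens them in two steps (vmovl_u8, then vmovl_u16). The helper name add_row_to_column_sums and the layout (one uint32_t running sum per column) are assumptions made for illustration, not something taken from the question. A scalar loop handles any leftover pixels and also serves as the fallback when NEON is not available.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: add one row of 8-bit pixels into per-column
 * 32-bit running sums. sums must hold at least `width` elements. */
static void add_row_to_column_sums(const uint8_t *row, uint32_t *sums,
                                   size_t width)
{
    size_t x = 0;

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Process 8 pixels per iteration. */
    for (; x + 8 <= width; x += 8) {
        uint8x8_t  p8  = vld1_u8(row + x);   /* load 8 x uint8_t        */
        uint16x8_t p16 = vmovl_u8(p8);       /* widen to 8 x uint16_t   */

        /* Widen each half again to get 4 x uint32_t per register. */
        uint32x4_t lo = vmovl_u16(vget_low_u16(p16));
        uint32x4_t hi = vmovl_u16(vget_high_u16(p16));

        /* Accumulate into the 32-bit column sums. */
        vst1q_u32(sums + x,     vaddq_u32(vld1q_u32(sums + x),     lo));
        vst1q_u32(sums + x + 4, vaddq_u32(vld1q_u32(sums + x + 4), hi));
    }
#endif

    /* Scalar tail (and full fallback on non-NEON builds). */
    for (; x < width; ++x)
        sums[x] += row[x];
}
```

You would call this once per image row with a zero-initialised sums array of 640 entries. With GCC on a Cortex-A8 you typically need something like -mfpu=neon for the NEON path to be compiled in; without it the scalar loop does all the work.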
