
Is there a way to force PMULHRSW to treat 0x8000 as 1.0 instead of -1.0?

To process 8-bit pixels, to do things like gamma correction without losing information, we normally upsample the values, work in 16 bits or whatever, and then downsample them to 8 bits.

Now, this is a somewhat new area for me, so please excuse incorrect terminology etc.

For my needs I have chosen to work in "non-standard" Q15, where I only use the upper half of the range (0.0-1.0), and 0x8000 represents 1.0 instead of -1.0. This makes it much easier to calculate things in C.

But I ran into a problem with SSSE3. It has the PMULHRSW instruction, which multiplies Q15 numbers, but it uses the "standard" Q15 range of [-1, 1-2⁻¹⁵], so multiplying (my) 0x8000 (1.0) by 0x4000 (0.5) gives 0xC000 (-0.5), because it thinks 0x8000 is -1. This is quite annoying.
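To make the behaviour concrete, here is a minimal scalar sketch of what PMULHRSW does to a single 16-bit lane, based on the formula in Intel's instruction reference. The function name `pmulhrsw_model` is made up for illustration, and the code assumes that `>>` on a negative `int32_t` is an arithmetic shift and that out-of-range narrowing to `int16_t` wraps (both are true on mainstream compilers but implementation-defined in C).

```c
#include <stdint.h>

/* Hypothetical scalar model of one lane of PMULHRSW:
   result = ((a * b >> 14) + 1) >> 1, truncated to 16 bits.
   Assumes arithmetic right shift of negative values and wrapping
   conversion to int16_t (implementation-defined in C). */
static int16_t pmulhrsw_model(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b; /* Q30 intermediate */
    return (int16_t)(((product >> 14) + 1) >> 1); /* round, back to Q15 */
}
```

With this model, `pmulhrsw_model(INT16_MIN, 0x4000)` (bit pattern 0x8000 times 0x4000) evaluates to -16384, which is the bit pattern 0xC000 described above.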

What am I doing wrong? Should I keep my pixel values in the 0000-7FFF range? Doesn't this kind of defeat the purpose of it being a fixed-point format? Is there a way around this? Maybe some trick?

Is there some kind of definitive treatise on Q15 which discusses all this?

Alex asked Aug 29 '12 15:08



1 Answer

Personally, I'd go with the solution of limiting the max value to 0x7FFF (~0.99something).

  • You don't have to jump through hoops getting the processor to work the way you'd like it to.
  • You don't have to spend a long time documenting the ins and outs of your "weird" code, as operating over 0-0x7FFF will be immediately recognisable to the readers of your code. Q-format is understood (in my experience) to run from -1.0 to +1.0 minus one LSB. The arithmetic doesn't work out so well otherwise, as the value of 1 LSB is different on each side of the 0!

Unless you can imagine yourself successfully arguing, to a panel of argumentative code reviewers, that that extra bit is critical to the operation of the algorithm rather than just "the last 0.01% of performance", stick to code everyone can understand, and which maps to the hardware you have available.
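As a rough sketch of what staying within 0-0x7FFF could look like, the helpers below scale 8-bit pixels up to Q15 with 0x7FFF as the maximum and back down with rounding. The names `u8_to_q15`, `q15_to_u8` and `Q15_MAX` are assumptions for illustration, not anything from the question.

```c
#include <stdint.h>

#define Q15_MAX 0x7FFF /* largest positive Q15 value, just under 1.0 */

/* Map an 8-bit pixel (0..255) to Q15 in 0..0x7FFF, rounding to nearest. */
static int16_t u8_to_q15(uint8_t v)
{
    return (int16_t)(((int32_t)v * Q15_MAX + 127) / 255);
}

/* Map a non-negative Q15 value (0..0x7FFF) back to 0..255, rounding to nearest. */
static uint8_t q15_to_u8(int16_t q)
{
    return (uint8_t)(((int32_t)q * 255 + Q15_MAX / 2) / Q15_MAX);
}
```

With these, 255 maps to 0x7FFF and back to 255, so the top of the pixel range survives the round trip even though 1.0 itself is never represented.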


Alternatively, rearrange your previous operation so that the pixels all come out as the negative of what you originally had, or change the following operations to take in the negative of what you previously sent them. Then use values from -1.0 to 0.0 in Q15 format, where -1.0 (0x8000) is exactly representable.
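A minimal sketch of the negated representation, assuming pixels are stored as -v and multiplied by an ordinary non-negative Q15 coefficient. All function names here are hypothetical, and `mulhrs_q15` is a scalar stand-in for one lane of PMULHRSW that assumes arithmetic right shift of negative values and wrapping conversion to `int16_t`.

```c
#include <stdint.h>

/* Scalar stand-in for one lane of PMULHRSW (assumes arithmetic >>
   on negative values and wrapping narrowing to int16_t). */
static int16_t mulhrs_q15(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;
    return (int16_t)(((product >> 14) + 1) >> 1);
}

/* Store pixel value v in [0.0, 1.0] as -v: 255 becomes -32768 (0x8000). */
static int16_t u8_to_neg_q15(uint8_t v)
{
    return (int16_t)(-(((int32_t)v * 32768 + 127) / 255));
}

/* Undo the negation and scale back to 0..255, rounding to nearest.
   Expects q in -32768..0. */
static uint8_t neg_q15_to_u8(int16_t q)
{
    return (uint8_t)((-(int32_t)q * 255 + 16384) / 32768);
}
```

Under these assumptions a full-scale pixel (255, stored as 0x8000) multiplied by 0.5 (0x4000) gives 0xC000, which now reads as a pixel value of 0.5 rather than as a wrong sign. Multiplying two negated values together would flip the sign back to positive, so this only stays consistent when one operand is a plain coefficient.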

Martin Thompson answered Sep 19 '22 14:09
