Fast inverse square root on the iPhone

The fast inverse square root function used by SGI/3dfx, and most notably in Quake, is often cited as being faster than the equivalent assembly instruction. However, the posts claiming that seem quite dated. I was curious about its performance on more modern hardware, and particularly on mobile devices like the iPhone. I wouldn't be surprised if the Quake inverse sqrt is no longer a worthwhile optimization on desktop systems, but how about for an iPhone project involving a lot of 3D math? Is it something that would be worthwhile to include?
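
For reference, the trick in question is usually written along these lines. This is a minimal sketch of the widely circulated version, not code taken from any particular codebase: the function name q_rsqrt is chosen here for illustration, and memcpy is used in place of the original pointer cast so the bit reinterpretation is well defined in standard C.

```c
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for positive, finite x.
 * q_rsqrt is a hypothetical name used for this sketch. */
static float q_rsqrt(float x)
{
    uint32_t i;
    float y;

    memcpy(&i, &x, sizeof i);        /* reinterpret the float's bits as an integer */
    i = 0x5f3759dfu - (i >> 1);      /* magic-constant initial guess */
    memcpy(&y, &i, sizeof y);        /* back to float */

    /* One Newton-Raphson step to refine the guess. */
    return y * (1.5f - 0.5f * x * y * y);
}
```

With a single refinement step the result is only accurate to a fraction of a percent, which is the trade-off I am asking about.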

TaylorP asked Jul 12 '11 15:07



1 Answer

No.

The NEON instruction set (like every other vector ISA*) has a hardware approximate reciprocal square root instruction that is much faster than that oft-cited "trick". Use it instead if reciprocal square root is actually a performance bottleneck in your code (as always, benchmark first; don't spend time optimizing something if you have no hard evidence that its performance matters).

You can get at it by writing your own assembly (inline or otherwise) with the vrsqrte.f32 instruction, or from C, Objective-C, or C++ by including the <arm_neon.h> header and using the vrsqrte_f32() intrinsic.

[*] On SSE it's rsqrtss/rsqrtps; on AltiVec it's vrsqrtefp (frsqrte is the scalar PowerPC counterpart).

Stephen Canon answered Sep 27 '22 20:09
