I browsed through the recently released Doom 3 BFG source code, when I came upon something that does not appear to make any sense. Doom 3 wraps mathematical functions in the idMath class. Some of the functions just foward to the corresponding functions from math.h
, but some are reimplementations (e.g. idMath::exp16()) that I assume have a higher performance than their math.h
counterparts (maybe at the expense of precision).
What baffles me, however, is the way they have implemented the float idMath::Sqrt(float x)
function:
ID_INLINE float idMath::InvSqrt( float x ) { return ( x > FLT_SMALLEST_NON_DENORMAL ) ? sqrtf( 1.0f / x ) : INFINITY; } ID_INLINE float idMath::Sqrt( float x ) { return ( x >= 0.0f ) ? x * InvSqrt( x ) : 0.0f; }
This appears to perform two unnecessary floating point operations: First a division and then a multiplication.
It is interesting to note that the original Doom 3 source code also implemented the square root function in this way, but the inverse square root uses the fast inverse square root algorithm.
ID_INLINE float idMath::InvSqrt( float x ) { dword a = ((union _flint*)(&x))->i; union _flint seed; assert( initialized ); double y = x * 0.5f; seed.i = (( ( (3*EXP_BIAS-1) - ( (a >> EXP_POS) & 0xFF) ) >> 1)<<EXP_POS) | iSqrt[(a >> (EXP_POS-LOOKUP_BITS)) & LOOKUP_MASK]; double r = seed.f; r = r * ( 1.5f - r * r * y ); r = r * ( 1.5f - r * r * y ); return (float) r; } ID_INLINE float idMath::Sqrt( float x ) { return x * InvSqrt( x ); }
Do you see any advantage in calculating Sqrt(x)
as x * InvSqrt(x)
if InvSqrt(x)
internally just calls math.h
's fsqrt(1.f/x)
? Am I maybe missing something important about denormalized floating point numbers here or is this just sloppiness on id software's part?
I can see two reasons for doing it this way: firstly, the "fast invSqrt" method (really Newton Raphson) is now the method used in a lot of hardware, so this approach leaves open the possibility of taking advantage of such hardware (and doing potentially four or more such operations at once). This article discusses it a little bit:
How slow (how many cycles) is calculating a square root?
The second reason is for compatibility. If you change the code path for calculating square roots, you may get different results (especially for zeroes, NaNs, etc.), and lose compatibility with code that depended on the old system.
As far as I know, the InvSqrt
is used to compute colors in the sense that color depends on the angle from which light bounces off a surface, which gives you some function using the inverse of the square root.
In their case, they don't need huge precision when computing these numbers, so the engineers behind Doom 3's code (originally from Quake III) came up with a very very very fast method of computing an approximation for InvSqrt
using only several Newton-Raphson's iterations.
This is why they use InvSqrt
in all their code, instead of using built-in (slower) functions. I guess the use of x * InvSqrt(x)
is there to avoid multiplying work by two (by having two very efficient functions, one for InvSqrt
and another for Sqrt
).
You should read this article, it might shed some light on this issue.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With