Reading through The Tricks of the 3D Game Programming Gurus, I came across this sort function written in inline assembly:
inline float FastSqrt(float Value)
{
float Result;
_asm
{
mov eax, Value
sub eax, 0x3F800000
sar eax, 1
add eax, 0x3F800000
mov Result, eax
}
return(Result);
}
It is an approximation of the actual square-root, but the accuracy is enough for my needs.
How does this actually work? What is this magical 0x3F800000
value? How are we achieving square root by subtracting, rotating and adding?
Here's how it looks in C/C++ code:
inline float FastSqrt_C(float Value)
{
float Result;
long Magic = *((long *)&Value);
Magic -= 0x3F800000;
Magic >>= 1;
Magic += 0x3F800000;
Result = *((float *)&Magic);
return(Result);
}
Many people have pointed out that 0x3f800000
is the representation of 1.0
. While this is true, it has nothing to do with how the calculation works. To understand it, you need to know how non-negative floats are stored. f = (1+m)*2^x
, with 0 <= m < 1
and m
being the mantissa, x
the exponent. Also note that x
is stored with a bias, so what's actually in the binary is x+127
. The 32 bit value is made up from the sign bit (which is zero in our case) followed by 8 bits of exponent storing x+127
and finally by 23 bits of mantissa, m
. (See the wikipedia article).
Apply some basic math,
sqrt(f) = sqrt((1+m)*2^x)
= sqrt(1+m)*sqrt(2^x)
= sqrt(1+m)*2^(x/2)
So, as a rough approximation, we need to halve the exponent but due to the bias we can't just do x/2
we need (x-127)/2 + 127
. This 127
shifted to the appropriate bit position is the magic 0x3f800000
.
The division by 2 is achieved using a right shift by one bit. Since this operates on the whole float, it has a side effect on the mantissa too.
First, assume the original exponent was even. Then the least significant bit that gets shifted out is zero. Thus, the mantissa gets halved too, so what we end up is: sqrt(f) = (1+m/2)*2^(x/2)
. We got the exponent correct, but the mantissa is (1+m/2)
instead of sqrt(1+m)
. The maximum relative error for this is (1.5 - sqrt(2))/sqrt(2) ~ 6%
which occurs if m
is almost 1
meaning f
is near, but less than an odd power of 2
. Take for example f=7.99
. The formula gives us about 2.998
instead of 2.827
which indeed has an error of 6%
.
Now, if the exponent was odd, then the least significant bit will be 1
and this when shifted into the mantissa will cause an increase by half. Thus, we get sqrt(f) = (1.5+m/2)*2^((x-1)/2)
. The maximum error for this is actually when m=0
, and that will be (1.5/sqrt(2)-sqrt(1))/sqrt(1)
which is again around 6%
. This occurs for numbers that are near to an odd power of two from above.
The two cases combined mean the worst inaccuracy is around 6% if the input value happens to be near an odd power of two. For even powers of two, the result is accurate.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With