How to implement fast inverse sqrt without undefined behavior? [duplicate]

Question

From what I understood about strict aliasing rule, this code for fast inverse square root will result in undefined behavior in C++:

float Q_rsqrt( float number )
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y  = number;
    i  = * ( long * ) &y; // type punning
    i  = 0x5f3759df - ( i >> 1 );
    y  = * ( float * ) &i;
    y  = y * ( threehalfs - ( x2 * y * y ) );

    return y;
}

Does this code indeed cause UB? If yes, how can it be reimplemented in standard compliant way? If not, why not?

Assumptions: before calling this function we have somehow checked that floats are in IEEE 754 32-bit format, sizeof(long)==sizeof(float) and the platform is little-endian.

Christian Rau · Accepted Answer

The standard compliant way would be std::memcpy. This should be sufficiently standard-compliant under your specified assumptions. Any reasonable compiler will turn that into a bunch of register moves if possible anyway. Furthermore, we can also alleviate (or at least check) some of your made assumptions using C++11's static_assert and fixed width integer types from <cstdint>. Endianness is irrelevant anyway, since we're not working on any arrays here, if an integer type is little-endian, a floating point type is also.

float Q_rsqrt( float number )
{
    static_assert(std::numeric_limits<float>::is_iec559, 
                  "fast inverse square root requires IEEE-comliant 'float'");
    static_assert(sizeof(float)==sizeof(std::uint32_t), 
                  "fast inverse square root requires 'float' to be 32-bit");
    float x2 = number * 0.5F, y = number;
    std::uint32_t i;
    std::memcpy(&i, &y, sizeof(float));
    i  = 0x5f3759df - ( i >> 1 );
    std::memcpy(&y, &i, sizeof(float));
    return y * ( 1.5F - ( x2 * y * y ) );
}

yuri kilochek · Answer

You should use memcpy. AFAIK this is the only standard compliant way and compilers are smart enough to replace the call with a single word move instruction. See this question for reasoning behind these claims.

Angew is no longer proud of SO · Answer

I don't think you can do this without UB in the strict standardese sense, simply because it relies on things like a particular value representation of float and long. The standard does not specify these (on purpose), and to give implementations the freedom to do what they see fit in this regard, specifically applies the "UB" label to all behaviour which would depend on such things.

That doesn't mean this will be a "Heisenbug" or "nasal demons" type of UB; most likely, the implementation will provide definitions for this behaviour. As long as these definitions satisfy the preconditions of the code, it's fine. The preconditions here being:

Type punning is allowed by the implementation.
sizeof(float) == sizeof(long).
The internal representation of float is such that if the bits are interpreted as a long, 0x5f3759df and >> 1 have the meaning reuqired for this hack to function.

However, it is impossible to re-write this in a fully portable way (i.e. without UB) - what would you do about platforms which use a different implementation of floating-point arithmetic, for example. As long as you're relying a particular platform's specifics, you're relying on behaviour which is undefined in the standard, but defined by the platform.

MSalters · Answer

Not. The algorithm starts out by assuming IEEE754 layout, which isn't a valid Standard-compliant assumption. The particular constant 0x5f3759df would be utterly wrong for any other layout. Even the assumption that such a constant exists is flawed. I think it breaks as soon as you'd reverse the order of the fields inside a float, for instance.

I'm also not sure if Q_rsqrt(-0.0f) works correctly even assuming IEEE754. Both zeroes (positive and negative) should be equal and sqrt(x) is only undefined for x<-0.0f

How to implement fast inverse sqrt without undefined behavior? [duplicate]

Tags:

c++

undefined-behavior

strict-aliasing

Ruslan

4 Answers

Christian Rau

yuri kilochek

Angew is no longer proud of SO

MSalters

Recent Activity

Donate For Us

How to implement fast inverse sqrt without undefined behavior? [duplicate]

Tags:

c++

undefined-behavior

strict-aliasing

Ruslan

4 Answers

Christian Rau

yuri kilochek

Angew is no longer proud of SO

MSalters

Related questions

Recent Activity

Donate For Us