
C++: Clear bits of a single precision float

I'm currently converting a program that was originally intended for OpenCL to C++ and I'm having a bit of trouble with one particular part of it.

One of the expressions commonly used in the program takes a 32-bit float, reinterprets its bits as an integer (not a rounding conversion to int, but a reinterpretation of the same data, think reinterpret_cast), performs some bit twiddling on it, and then reinterprets the result as a float again. This works well in OpenCL, but with C++ and gcc it violates the strict aliasing rules, which breaks the program when optimization is enabled. Depending on the architecture, it may also involve an expensive load-hit-store, since float and integer registers are separate.

I've been able to avoid most of these expressions efficiently, but there is one that I'm not sure can be done any faster. The intention is to clear a number of bits from the right of a float; the OpenCL code does it roughly like this:

float ClearFloatBits(float Value, int NumberOfBits) {
    return __int_as_float((__float_as_int(Value) >> NumberOfBits) << NumberOfBits);
}

Since this is essentially rounding down from a specified (binary) digit, my C++ version now looks like this:

float ClearFloatBits(float Value, int NumberOfBits) {
    float Factor = pow(2.0f, 23 - NumberOfBits);

    return ((int)(Value*Factor))/Factor;
}

In the real code, the pow and the division are of course replaced by a LUT lookup and a corresponding multiplication; they are omitted here for readability.
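For reference, the LUT variant described above might look roughly like the following. This is only a sketch: the names PowerTables and ClearFloatBitsLut are made up for illustration, and it assumes a 32-bit IEEE 754 float, a Value in [1.0, 2.0) and a NumberOfBits between 0 and 23.

```cpp
#include <cmath>

// Hypothetical lookup tables: Scale[n] = 2^(23 - n), InvScale[n] = 2^(n - 23).
struct PowerTables {
    float Scale[24];
    float InvScale[24];
    PowerTables() {
        for (int n = 0; n < 24; ++n) {
            Scale[n] = std::ldexp(1.0f, 23 - n);
            InvScale[n] = std::ldexp(1.0f, n - 23);
        }
    }
};

// Assumes 1.0f <= Value < 2.0f and 0 <= NumberOfBits <= 23.
float ClearFloatBitsLut(float Value, int NumberOfBits) {
    static const PowerTables Tables;  // built once, on the first call
    // Truncation toward zero rounds down here because Value is positive.
    const int Truncated = static_cast<int>(Value * Tables.Scale[NumberOfBits]);
    // Multiply by the reciprocal power of two instead of dividing.
    return static_cast<float>(Truncated) * Tables.InvScale[NumberOfBits];
}
```

Because every factor is a power of two, both multiplications are exact for inputs in the stated range.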

Is there a better way to do this? What bugs me in particular is the (int) conversion to round down, which I guess is the most expensive part. It is guaranteed that the float passed to the function is a number between 1.0 (inclusive) and 2.0 (exclusive), if that helps.

Thanks in advance

Benedikt Bitterli asked Aug 26 '11 11:08


2 Answers

Use the union hack instead:

float ClearFloatBits(float Value, int NumberOfBits) {
   union { unsigned int int_val; float flt_val; } union_hack;
   union_hack.flt_val = Value;
   // Shift right, then left, to zero the lowest NumberOfBits bits.
   union_hack.int_val = (union_hack.int_val >> NumberOfBits) << NumberOfBits;
   return union_hack.flt_val;
}

Strictly speaking, this is undefined behavior in C++: the standard does not allow writing to one member of a union and then reading from another member without first writing to that other member. C, by contrast, permits this kind of type punning as of C99.

However, this usage of unions is so widespread and so old that no compiler writer that I know of enforces the standard here. In practice, the behavior is well defined and is exactly what you would expect. That said, this hack might not work if ported to a very unusual architecture with a very strictly conforming compiler.
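If you would rather not rely on the union at all, copying the bits with std::memcpy is well defined in C++. The following is a minimal sketch, not a drop-in replacement: the name ClearFloatBitsMemcpy is made up, and it assumes a 32-bit IEEE 754 float and a NumberOfBits between 0 and 31.

```cpp
#include <cstdint>
#include <cstring>

// Assumes float is a 32-bit IEEE 754 value and 0 <= NumberOfBits < 32.
float ClearFloatBitsMemcpy(float Value, int NumberOfBits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "float must be 32 bits wide");
    std::uint32_t Bits;
    // Copy the object representation instead of aliasing it.
    std::memcpy(&Bits, &Value, sizeof Bits);
    // Shift right, then left, to zero the lowest NumberOfBits bits.
    Bits = (Bits >> NumberOfBits) << NumberOfBits;
    std::memcpy(&Value, &Bits, sizeof Value);
    return Value;
}
```

For example, ClearFloatBitsMemcpy(1.75f, 22) leaves only the top mantissa bit and returns 1.5f. Whether the copies are optimized away depends on the compiler and the optimization level, so it is worth checking the generated code.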

David Hammen answered Sep 25 '22 16:09


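Reinterpreting as an int violates aliasing rules. Reinterpreting as an unsigned char[4] doesn't. Do you need to support NumberOfBits values >= 8? If not, you can just do the bit shift on the byte that holds the lowest mantissa bits: ptr[3] on a big-endian machine, ptr[0] on a little-endian one.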
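A rough sketch of that idea follows. The name ClearLowFloatBits is made up, and the code assumes a 32-bit IEEE 754 float, a NumberOfBits between 0 and 7, and that floats use the same byte order as integers on the target.

```cpp
#include <cstdint>
#include <cstring>

// Assumes a 32-bit IEEE 754 float, 0 <= NumberOfBits < 8, and that floats
// and integers share the same byte order on the target machine.
float ClearLowFloatBits(float Value, int NumberOfBits) {
    // Accessing an object through unsigned char* does not violate aliasing rules.
    unsigned char* Bytes = reinterpret_cast<unsigned char*>(&Value);
    // Find the least significant byte: index 0 on little-endian, last on big-endian.
    const std::uint32_t One = 1;
    unsigned char First;
    std::memcpy(&First, &One, 1);
    const int Low = (First == 1) ? 0 : static_cast<int>(sizeof(float)) - 1;
    // Zero the lowest NumberOfBits bits of that byte only.
    Bytes[Low] = static_cast<unsigned char>((Bytes[Low] >> NumberOfBits) << NumberOfBits);
    return Value;
}
```

The runtime byte-order check is only there to keep the sketch portable; if the target is known, the index can be hard-coded.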

MSalters answered Sep 21 '22 16:09