
How to convert packed integer (16.16) fixed-point to float?

How to convert a "32-bit signed fixed-point number (16.16)" to a float?

Is (fixed >> 16) + (fixed & 0xffff) / 65536.0 ok? What about -2.5? And -0.5?

Or is fixed / 65536.0 the right way?

(PS: What does signed fixed-point "-0.5" look like in memory anyway?)

Ecir Hana asked Dec 26 '11 20:12



2 Answers

I assume two's complement 32 bit integers and operators working as in C#.

How to do the conversion?

fixed / 65536.0

is correct and easy to understand.


(fixed >> 16) + (fixed & 0xffff) / 65536.0

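Is equivalent to the above for non-negative values, but slower and harder to read. You're basically using the distributive law to split a single division into two, and writing the first one as a bit shift.

For negative values, fixed & 0xffff is not the fractional part on its own: for the raw integer -1 it gives 65535, not -1. The sum is still right as long as >> is an arithmetic shift (as it is for int in C#), because the shift then rounds toward negative infinity and compensates.

Look at the raw integer -1, which should map to -1/65536. With an arithmetic shift the code returns -1 + 65535/65536 = -1/65536, which is correct. With a logical shift it returns 65535 + 65535/65536 instead, so the single division is the safer choice.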
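
Whether the split form survives negative inputs depends on the kind of right shift, which is easy to check directly. The sketch below is in Java, where >> is an arithmetic shift and >>> is a logical one; the class and method names are made up for this illustration.

```java
// Illustrative sketch; Fixed1616Compare and its methods are hypothetical names.
final class Fixed1616Compare {
    // Single division: the raw integer is the value scaled by 2^16.
    static double byDivision(int fixed) {
        return fixed / 65536.0;
    }

    // Split form with an arithmetic shift (>>), which floors toward negative infinity.
    static double bySplitArithmetic(int fixed) {
        return (fixed >> 16) + (fixed & 0xffff) / 65536.0;
    }

    // Split form with a logical shift (>>>), which treats the bits as unsigned.
    static double bySplitLogical(int fixed) {
        return (fixed >>> 16) + (fixed & 0xffff) / 65536.0;
    }

    public static void main(String[] args) {
        // Raw encodings of 2.5, -2.5, -0.5 and -1/65536.
        int[] samples = { 163840, -163840, -32768, -1 };
        for (int raw : samples) {
            System.out.println(raw + ": " + byDivision(raw)
                    + " | " + bySplitArithmetic(raw)
                    + " | " + bySplitLogical(raw));
        }
    }
}
```

For these samples the first two columns agree, while the logical-shift column goes wrong as soon as the input is negative.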


Depending on your compiler it might be faster to do:

fixed * (1/65536.0)

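But I assume most modern compilers already do that optimization. Because 65536 is a power of two, 1/65536.0 is exactly representable, so both forms give the same result.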
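
As a rough check that the multiplication is a drop-in replacement for the division, the Java sketch below compares the two on a handful of raw values. ReciprocalCheck and its members are hypothetical names; the comparison relies on 1/65536.0 being exactly 2^-16.

```java
// Illustrative sketch; ReciprocalCheck is a hypothetical helper class.
final class ReciprocalCheck {
    // 1/65536 is 2^-16, which a double represents exactly.
    static final double INV_ONE = 1 / 65536.0;

    static double byDivision(int fixed) {
        return fixed / 65536.0;
    }

    static double byMultiplication(int fixed) {
        return fixed * INV_ONE;
    }

    // Returns true when both forms agree on every sample.
    static boolean agree(int[] samples) {
        for (int raw : samples) {
            if (byDivision(raw) != byMultiplication(raw)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] samples = { 0, 1, -1, 163840, -32768, Integer.MAX_VALUE, Integer.MIN_VALUE };
        System.out.println(agree(samples));
    }
}
```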

What does signed fixed-point "-0.5" look like in memory anyway?

Inverting the conversion gives us:

RoundToInt(float*65536)

Setting float=-0.5 gives us -32768, which is 0xFFFF8000 as a 32-bit two's complement integer.
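
To see the bit pattern directly, here is a small Java sketch. RoundToInt above is pseudocode, so Math.round stands in for it here as an assumption; MinusHalfBits and toFixed are hypothetical names. The order of the four bytes in memory depends on the machine's endianness, which this does not show.

```java
// Illustrative sketch; MinusHalfBits is a hypothetical class name.
final class MinusHalfBits {
    // Scale by 2^16 and round to the nearest integer (Math.round stands in for RoundToInt).
    static int toFixed(double value) {
        return (int) Math.round(value * 65536.0);
    }

    public static void main(String[] args) {
        int raw = toFixed(-0.5);
        System.out.println(raw);                      // -32768
        System.out.println(Integer.toHexString(raw)); // ffff8000
    }
}
```

The upper 16 bits are all ones and the lower 16 bits are 0x8000: two's complement applies to the 32-bit word as a whole, not to the integer and fractional halves separately.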

CodesInChaos answered Oct 05 '22 00:10


class FixedPointUtils {
  public static final int ONE = 0x10000;

  /**
   * Convert an array of floats to 16.16 fixed-point
   * @param arr The array
   * @return A newly allocated array of fixed-point values.
   */
  public static int[] toFixed(float[] arr) {
    int[] res = new int[arr.length];
    toFixed(arr, res);
    return res;
  }

  /**
   * Convert a float to 16.16 fixed-point representation.
   * The cast truncates toward zero rather than rounding to nearest.
   * @param val The value to convert
   * @return The resulting fixed-point representation
   */
  public static int toFixed(float val) {
    return (int)(val * 65536F);
  }

  /**
   * Convert an array of floats to 16.16 fixed-point
   * @param arr The array of floats
   * @param storage The location to store the fixed-point values.
   */
  public static void toFixed(float[] arr, int[] storage)
  {
    for (int i=0;i<storage.length;i++) {
      storage[i] = toFixed(arr[i]);
    }
  }

  /**
   * Convert a 16.16 fixed-point value to floating point
   * @param val The fixed-point value
   * @return The equivalent floating-point value.
   */
  public static float toFloat(int val) {
    return ((float)val)/65536.0f;
  }

  /**
   * Convert an array of 16.16 fixed-point values to floating point
   * @param arr The array to convert
   * @return A newly allocated array of floats.
   */
  public static float[] toFloat(int[] arr) {
    float[] res = new float[arr.length];
    toFloat(arr, res);
    return res;
  }

  /**
   * Convert an array of 16.16 fixed-point values to floating point
   * @param arr The array to convert
   * @param storage Pre-allocated storage for the result.
   */
  public static void toFloat(int[] arr, float[] storage)
  {
    for (int i=0;i<storage.length;i++) {
      storage[i] = toFloat(arr[i]);
    }
  }

}
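
A short usage sketch for the two scalar conversions in the class above. They are copied into a hypothetical FixedPointDemo class so that the snippet compiles on its own; roundToFixed is an addition for comparison and not part of the original answer.

```java
// Illustrative sketch; FixedPointDemo and roundToFixed are hypothetical additions.
final class FixedPointDemo {
    // Same as FixedPointUtils.toFixed(float): the cast truncates toward zero.
    static int toFixed(float val) {
        return (int) (val * 65536F);
    }

    // Variant that rounds to the nearest representable fixed-point value.
    static int roundToFixed(float val) {
        return Math.round(val * 65536F);
    }

    // Same as FixedPointUtils.toFloat(int).
    static float toFloat(int val) {
        return ((float) val) / 65536.0f;
    }

    public static void main(String[] args) {
        System.out.println(toFixed(-0.5f));     // -32768
        System.out.println(toFloat(-32768));    // -0.5
        System.out.println(toFixed(0.1f));      // 6553 (truncated from about 6553.6)
        System.out.println(roundToFixed(0.1f)); // 6554
    }
}
```

Values that are exact multiples of 1/65536, such as -0.5, round-trip unchanged. For anything else the truncating cast and the rounding variant can differ by one unit in the last place.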
Ashok Domadiya answered Oct 05 '22 01:10