Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I save a floating-point number in 2 bytes?

Yes I'm aware of the IEEE-754 half-precision standard, and yes I'm aware of the work done in the field. Put very simply, I'm trying to save a simple floating point number (like 52.1, or 1.25) in just 2 bytes.

I've tried some implementations in Java and in C# but they ruin the input value by decoding a different number. You feed in 32.1 and after encode-decode you get 32.0985.

Is there ANY way I can store floating point numbers in just 16-bits without ruining the input value?

Thanks very much.

like image 945
Robin Rodricks Avatar asked May 02 '12 13:05

Robin Rodricks


People also ask

How are floats stored as bytes?

Scalars of type float are stored using four bytes (32-bits). The format used follows the IEEE-754 standard. The mantissa represents the actual binary digits of the floating-point number.

How do you store a floating-point number?

Floating-point numbers are encoded by storing the significand and the exponent (along with a sign bit). Like signed integer types, the high-order bit indicates sign; 0 indicates a positive value, 1 indicates negative. The next 8 bits are used for the exponent.

How many bytes is a floating-point number?

Floating-point numbers use the IEEE (Institute of Electrical and Electronics Engineers) format. Single-precision values with float type have 4 bytes, consisting of a sign bit, an 8-bit excess-127 binary exponent, and a 23-bit mantissa.

Is two bytes a float?

In computing, half precision (sometimes called FP16) is a binary floating-point computer number format that occupies 16 bits (two bytes in modern computers) in computer memory.


2 Answers

You could store three digits in BCD and use the remaining four bits for the decimal point position:

52.1 = 521 * 10 ^ -1 => 0x1521
1.25 = 125 * 10 ^ -2 => 0x2125

This would give you a range from 0.0000000000000001 to 999. You can of course add an offset for the decimal point to get for example the range 0.0000000001 to 999000000.


Simple implementation of four bit used for decimal point placement, and the rest for the value. Without any error check, and not thoroughly checked. (May have precision issues with some values when using != to compare doubles.)

public static short Encode(double value) {
  int cnt = 0;
  while (value != Math.Floor(value)) {
    value *= 10.0;
    cnt++;
  }
  return (short)((cnt << 12) + (int)value);
}

public static double Decode(short value) {
  int cnt = value >> 12;
  double result = value & 0xfff;
  while (cnt > 0) {
    result /= 10.0;
    cnt--;
  }
  return result;
}

Example:

Console.WriteLine(Encode(52.1));
Console.WriteLine(Decode(4617));

Output:

4617
52.1
like image 155
Guffa Avatar answered Sep 28 '22 04:09

Guffa


C# has no built in functionality for that, but you could try a fixed point approach.

Example of 8,8 Fixed point (8 before comma, 8 after):

float value = 123.45;
ushort fixedIntValue = (ushort)(value * 256);

This way, the number is stored like this: XXXXXXXX,XXXXXXXX

and you can retrieve the float again using this:

float value = fixedIntValue / 256f;
like image 44
bytecode77 Avatar answered Sep 28 '22 02:09

bytecode77