Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Portability of binary serialization of double/float type in C++

Brian "Beej Jorgensen" Hall gives in his Guide to Network Programming some code to pack float (resp. double) to uint32_t (resp. uint64_t) to be able to safely transmit it over the network between two machine that may not both agree to their representation. It has some limitation, mainly it does not support NaN and infinity.

Here is his packing function:

#define pack754_32(f) (pack754((f), 32, 8))
#define pack754_64(f) (pack754((f), 64, 11))

uint64_t pack754(long double f, unsigned bits, unsigned expbits)
{
    long double fnorm;
    int shift;
    long long sign, exp, significand;
    unsigned significandbits = bits - expbits - 1; // -1 for sign bit

    if (f == 0.0) return 0; // get this special case out of the way

    // check sign and begin normalization
    if (f < 0) { sign = 1; fnorm = -f; }
    else { sign = 0; fnorm = f; }

    // get the normalized form of f and track the exponent
    shift = 0;
    while(fnorm >= 2.0) { fnorm /= 2.0; shift++; }
    while(fnorm < 1.0) { fnorm *= 2.0; shift--; }
    fnorm = fnorm - 1.0;

    // calculate the binary form (non-float) of the significand data
    significand = fnorm * ((1LL<<significandbits) + 0.5f);

    // get the biased exponent
    exp = shift + ((1<<(expbits-1)) - 1); // shift + bias

    // return the final answer
    return (sign<<(bits-1)) | (exp<<(bits-expbits-1)) | significand;
}

What's wrong with a human readable format.

It has a couple of advantages over binary:

  • It's readable
  • It's portable
  • It makes support really easy
    (as you can ask the user to look at it in their favorite editor even word)
  • It's easy to fix
    (or adjust files manually in error situations)

Disadvantage:

  • It's not compact
    If this a real problem you can always zip it.
  • It may be slightly slower to extract/generate
    Note a binary format probably needs to be normalized as well (see htonl())

To output a double at full precision:

double v = 2.20;
std::cout << std::setprecision(std::numeric_limits<double>::digits) << v;

OK. I am not convinced that is exactly precise. It may lose precision.


Take a look at the (old) gtypes.h file implementation in glib 2 - it includes the following:

#if G_BYTE_ORDER == G_LITTLE_ENDIAN
union _GFloatIEEE754
{
  gfloat v_float;
  struct {
    guint mantissa : 23;
    guint biased_exponent : 8;
    guint sign : 1;
  } mpn;
};
union _GDoubleIEEE754
{
  gdouble v_double;
  struct {
    guint mantissa_low : 32;
    guint mantissa_high : 20;
    guint biased_exponent : 11;
    guint sign : 1;
  } mpn;
};
#elif G_BYTE_ORDER == G_BIG_ENDIAN
union _GFloatIEEE754
{
  gfloat v_float;
  struct {
    guint sign : 1;
    guint biased_exponent : 8;
    guint mantissa : 23;
  } mpn;
};
union _GDoubleIEEE754
{
  gdouble v_double;
  struct {
    guint sign : 1;
    guint biased_exponent : 11;
    guint mantissa_high : 20;
    guint mantissa_low : 32;
  } mpn;
};
#else /* !G_LITTLE_ENDIAN && !G_BIG_ENDIAN */
#error unknown ENDIAN type
#endif /* !G_LITTLE_ENDIAN && !G_BIG_ENDIAN */

glib link


Just write the binary IEEE754 representation to disk, and document this as your storage format (along with is endianness). Then it's up to the implementation to convert this to its internal representation if necessary.