
IEEE 754/IEC 559

Is the IEEE 754 floating-point format well defined across platforms, in terms of both bit format and endianness?

I am willing to add the following to my code (for an initial version):

static_assert(std::numeric_limits<float>::is_iec559, "Only support IEC 559 (IEEE 754) float");
static_assert(sizeof(float) * CHAR_BIT == 32, "Only support float => Single Precision IEC 559 (IEEE 754)");

static_assert(std::numeric_limits<double>::is_iec559, "Only support IEC 559 (IEEE 754) double");
static_assert(sizeof(float) * CHAR_BIT == 64, "Only support double => Double Precision IEC 559 (IEEE 754)");

static_assert(std::numeric_limits<long double>::is_iec559, "Only support IEC 559 (IEEE 754) long double");
static_assert(sizeof(float) * CHAR_BIT == 128, "Only support long double  => Exteneded Precision IEC 559 (IEEE 754)");
//  More asserts if required.
//  I noticed my current system has a sizeof(long double) => 128
//  But numeric_limits<long double>::digits  => 63
//  So we are not storing quad precision floats only extended.

If I write my float/double/long double in binary format, can these be transported between systems without further interpretation? That is:

void write(std::ostream& stream, double value)
{
     stream.write(reinterpret_cast<char const*>(&value), 8);
}

....

double read(std::istream& stream)
{
     double   value;
     stream.read(reinterpret_cast<char*>(&value), 8);
     return value;
}

Or do I need to break the double up into integer components for transport (as suggested by this answer)?

The difference here is that I am willing to limit my supported representation to IEEE-754. Will this basically solve my binary storage of floating-point values, or do I need to take further steps?

Note: For non-conforming platforms (when I find them) I am willing to special-case the code so that they read/write IEEE-754 into the local representation. But I want to know whether the bit layout and endianness are well enough defined across platforms to support storage/transport.

Martin York asked Feb 20 '15 09:02



2 Answers

The bit format is well defined, but not all machines are little-endian, and the IEEE standard does not require floating-point numbers to be stored in any particular byte order. You can run the following program to see the byte pattern of the double 42.0:

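#include <stdio.h>
#include <limits>

int main() {
  double d = 42;
  // Prints 1 if double is an IEC 559 (IEEE 754) type on this platform
  printf("%i\n", std::numeric_limits<double>::is_iec559);
  // Dump the object representation one byte at a time, in memory order.
  // unsigned char avoids sign extension when a byte is 0x80 or above.
  for (const unsigned char *c = (const unsigned char *)&d;
       c != (const unsigned char *)(&d + 1); c++) {
    printf("%02x ", *c);
  }
  printf("\n");
}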

On an old, unmaintained Sun machine using g++ 3.4.5, this prints

1
40 45 00 00 00 00 00 00

On an x86_64 machine running a much more recent g++:

1
00 00 00 00 00 00 45 40
tmyklebu answered Sep 21 '22 15:09


First of all, you may want to change your code so that each assertion checks the size of the right type (the version in the question uses sizeof(float) in all three):

#include <climits>  // CHAR_BIT
#include <limits>   // std::numeric_limits

static_assert(std::numeric_limits<float>::is_iec559, "Only support IEC 559 (IEEE 754) float");
static_assert(sizeof(float) * CHAR_BIT == 32, "Only support float => Single Precision IEC 559 (IEEE 754)");

static_assert(std::numeric_limits<double>::is_iec559, "Only support IEC 559 (IEEE 754) double");
static_assert(sizeof(double) * CHAR_BIT == 64, "Only support double => Double Precision IEC 559 (IEEE 754)");

static_assert(std::numeric_limits<long double>::is_iec559, "Only support IEC 559 (IEEE 754) long double");
static_assert(sizeof(long double) * CHAR_BIT == 128, "Only support long double  => Extended Precision IEC 559 (IEEE 754)");

The catch is that nothing requires long double to be 128 bits long. IEEE-754 says nothing about long double at all, and the C++ standard only requires long double to be at least as precise as double, so its format varies with the compiler and platform. On x86, gcc uses the 80-bit x87 extended format (usually padded to 96 or 128 bits in memory), while Visual Studio makes long double the same 64-bit format as double. IEEE-754 does specify binary128, but whether the compiler supports it depends on the platform and the implementation (gcc has a non-standard __float128 type for it).
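
Because the storage size of long double can include padding, the significand width reported by std::numeric_limits is a more direct way to tell which format is in use. The following is a rough sketch: long_double_format is a made-up helper name, and the mapping from digit counts to format names covers only the common cases (113 bits for binary128, 64 for x87 extended, 53 when long double is the same as double).

```cpp
#include <limits>

// Hypothetical helper: name the long double format from its significand
// width. digits counts significand bits, including any implicit leading bit.
constexpr const char* long_double_format() {
    return std::numeric_limits<long double>::digits == 113 ? "binary128 (quad precision)"
         : std::numeric_limits<long double>::digits == 64  ? "x87 80-bit extended precision"
         : std::numeric_limits<long double>::digits == 53  ? "same format as double (binary64)"
         : "other format (for example double-double)";
}

// The one property the C++ standard does promise about long double.
static_assert(std::numeric_limits<long double>::digits >=
                  std::numeric_limits<double>::digits,
              "long double must be at least as precise as double");
```

Printing long_double_format() next to sizeof(long double) * CHAR_BIT on each target shows whether a 128-bit storage size really means quad precision or just a padded extended format.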

If you limit your supported representation to IEEE-754, you should not run into any problems.

Marandil answered Sep 20 '22 15:09