
Why will decimal128 probably be standardized while quad precision will not?

This is a very naive question. If we look at the C and C++ standards committees, both are currently working on adding standard decimal floating-point types:

  • link to the C proposal
  • link to the C++ proposal

So it seems that we will probably get a standardized decimal128 type while we still do not have any standardized binary128 type (true quad precision, not merely extended double precision). Is there a technical reason for this situation, or is it purely "political"?

Vincent asked Dec 20 '14

3 Answers

Quad precision binary floating point is not a substitute for a decimal type. The precision problem is secondary to that of representation of decimal numbers. The idea is to add a type to the languages to support representation of numbers like 0.1 without any loss of precision - something you cannot do with a binary floating point type, no matter how high its precision may be.

That is why the discussion of adding a decimal type is orthogonal to the discussion about adding a quad precision data type: the two types serve different purposes, as discussed in one of the proposals that you linked:

Human computation and communication of numeric values almost always uses decimal arithmetic and decimal notations. Laboratory notes, scientific papers, legal documents, business reports and financial statements all record numeric values in decimal form. When numeric data are given to a program or are displayed to a user, binary to-and-from decimal conversion is required. There are inherent rounding errors involved in such conversions; decimal fractions cannot, in general, be represented exactly by binary floating-point values. These errors often cause usability and efficiency problems, depending on the application.

Sergey Kalinichenko answered Nov 10 '22

Here are a few simple reasons why there is work on decimal128 but not on a 128-bit binary floating-point type:

  1. IEEE 754(2008) defines three basic decimal floating-point formats (32, 64, and 128 bit). It seems reasonable to standardize interfaces for all three when adding support, especially as there isn't much difference between them (well, the 32 bit version doesn't specify arithmetic). The decimal floating-point support will be required to use IEEE 754(2008) semantics.
  2. The currently defined binary floating-point format is [still] not required to follow IEEE 754 semantics and doesn't even fix the base (there are implementations using base 2 and base 16). For platforms not using IEEE it isn't clear how the format would be extended. Where IEEE 754 is used, the support was based on IEEE 754(1984), which only defined two basic formats, and there was no proposal mandating a third.
  3. The current definition of long double is sufficiently vague, and it seems unlikely that any of the vendors would accept changing its current meaning to use IEEE 754(2008) 128 bit semantics: it would change the behavior of roughly all implementations. I'd expect objections to mandating IEEE 754 for float and double as well, i.e., any IEEE 754 support for binary floating points would be something entirely new which someone would need to propose. I'd expect such a proposal to be somewhat controversial, e.g., with respect to what names to use and whether to actually add 128 bit support, since most users would expect it to receive hardware support, and the people working on hardware seem to have other priorities. Note that nobody expects (or should expect) hardware support for decimal floating points either: although there is hardware support on Power7 and later processors, no other vendor is contemplating the idea.
  4. I have zero interest in, use for, or experience with 128-bit binary floating-point values. On the other hand, I am interested in and have uses for decimal floating point (my experience is somewhat limited, but it is certainly greater than my use of 128-bit binary floats). The primary use I have is making it easier to compute correctly with decimal values: yes, I realize that it is possible to use binary floating point and/or integers correctly, but in practice hardly anybody does these computations correctly, and with decimal types it is nearly trivial to do the math correctly. Given that the addition of 128-bit binary floating point would require non-trivial work and would potentially endanger a joint proposal, I'm not going to add it. Of course, that doesn't mean someone else couldn't do the work.
  5. Although binary floating-point computations can be exact, binary types are mostly used for fast computation where rounding is accepted: losing a few bits seems tolerable. I realize that some applications would benefit from a bigger range of values, but that argument would justify support for an unlimited number of bits. The reasoning is different for decimal floating points: the only reason to use them is exact arithmetic, with a rather limited set of operations normally being used, and slower computation is more acceptable than incorrect results. Although 16 digits tend to be sufficient for most uses, there are already a few uses which go slightly beyond 16 digits or come quite close. I guess this reasoning led the people working on IEEE 754 to include a 128 bit format when decimal floating points were initially added, while a similar reasoning wasn't applied when binary floating points were initially standardized.

tl;dr: There is nothing political about 128 bit decimal formats being worked on while 128 bit binary formats are not: there is a proposal for one and not the other, and the proposer (me) has no interest in writing proposals for both.

Dietmar Kühl answered Nov 10 '22

There is some work on supporting IEEE 754-2008 in ISO C, which means that binary128 (and more) may be standardized; see ISO/IEC JTC 1/SC 22/WG 14 N1789. C++ should then follow.

Now, though binary128 is sometimes implemented, I doubt that it will see wide use for some time, since the current implementations are entirely in software (this may change, though), and there are faster and more flexible ways to get more accurate results: double-double arithmetic or similar ideas (e.g. floating-point expansions, which are more or less a generalization of double-double arithmetic).

vinc17 answered Nov 10 '22