I'm aware of the usual issues with floating-point arithmetic and precision loss, so this is not the usual question about why 0.1 + 0.2 != 0.3 and the like.
Instead, I would like to implement a binary predicate in C++ (in a 100% standard-compliant way) that implements a real mathematical equivalence relation (i.e. one that is reflexive, symmetric, and transitive), such that two doubles are in the same equivalence class if and only if they represent the exact same value in all respects: distinguishing corner cases like 0.0 and -0.0, but treating all NaN values as belonging to the same equivalence class. (In particular, the default == is not what I want, because it is non-reflexive in the case of NaN, and it does not distinguish between 0.0 and -0.0, which I would like to be in different equivalence classes, as they are genuinely different values and lead to different runtime behavior.)
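For example, both problems are easy to demonstrate (just a sanity check, assuming an ordinary IEEE-style double):

#include <cmath>
#include <iostream>

int main()
{
    double n = std::nan("");
    std::cout << (n == n) << '\n';       // prints 0: == is not reflexive for NaN
    std::cout << (0.0 == -0.0) << '\n';  // prints 1: == conflates the two zeros
    std::cout << std::signbit(0.0) << ' ' << std::signbit(-0.0) << '\n';  // 0 1: yet they are distinct values
}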
What's the shortest and simplest way to do this that does not rely on type punning in any way or any implementation-defined behavior? So far I've got:
#include <cmath>

bool equiv(double x, double y)
{
    // Same value and, for zeros, the same sign; or both NaN.
    return (x == y && (x != 0.0 || std::signbit(x) == std::signbit(y))) ||
           (std::isnan(x) && std::isnan(y));
}
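For reference, here are a few quick checks I would expect to pass (just a sketch, using equiv as defined above):

#include <cassert>
#include <cmath>

int main()
{
    assert( equiv(1.5, 1.5));                    // ordinary values behave like ==
    assert(!equiv(0.0, -0.0));                   // the two zeros are separated
    assert( equiv(std::nan(""), std::nan("1"))); // all NaNs fall into one class
    assert(!equiv(std::nan(""), 1.0));
}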
I believe this handles the corner cases I know about and described earlier, but are there any other corner cases that this doesn't handle that I'm missing? And is the above binary predicate guaranteed to define an equivalence relationship according to the C++ standard, or is any of the behavior unspecified, implementation-defined, etc.?
Looks right.
You can actually get rid of the function calls on platforms that implement IEEE 754 (Intel's, POWER's, and ARM's do), because the special floating-point values can be detected without any calls:
bool equiv(double x, double y) {
    // x != x holds only for NaN; 1 / x tells +0.0 (+inf) and -0.0 (-inf) apart.
    return (x == y && (x || (1 / x == 1 / y))) || (x != x && y != y);
}
The above relies on two IEEE facts, illustrated below: 1 / -0.0 yields -infinity (while 1 / 0.0 yields +infinity), and infinity values with the same sign compare equal; similarly, x != x is true only when x is NaN. The original version, though, reads better for most people. Judging from interviewing experience, not every developer knows how the special floating-point values arise and behave.
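A quick demonstration of those facts (on an IEEE 754 platform; strictly speaking, ISO C++ leaves division by zero undefined):

#include <iostream>

int main()
{
    std::cout << (1 / -0.0) << '\n';            // -inf
    std::cout << (1 / 0.0 == 1 / 0.0) << '\n';  // 1: same-signed infinities compare equal
    double n = 0.0 / 0.0;                        // NaN on IEEE hardware
    std::cout << (n != n) << '\n';               // 1: only NaN compares unequal to itself
}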
If only NaNs had one representation, you could just do memcmp.
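That hypothetical version would look something like the sketch below. Note it is not a correct equivalence for real doubles, because IEEE 754 NaNs carry sign and payload bits and so have many bit patterns:

#include <cstring>

// Hypothetical: correct only in a world where every NaN had a single
// bit pattern. With real IEEE 754 NaNs this splits them into one
// equivalence class per representation.
bool bitwise_equiv(double x, double y)
{
    return std::memcmp(&x, &y, sizeof x) == 0;
}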
With regard to the C and C++ language standards, The New C Standard book says:
The term IEEE floating point is often heard. This usage came about because the original standards on this topic were published by the IEEE. This standard for binary floating-point arithmetic is what many host processors have been providing for over a decade. However, its use is not mandated by C99.
The representation for binary floating-point specified in this standard is used by the Intel x86 processor family, Sun SPARC, HP PA-RISC, IBM PowerPC, HP (was DEC) Alpha, and the majority of modern processors (some DSP processors support a subset, or make small changes, for cost/performance reasons; while others have more substantial differences, e.g., the TMS320C3x uses two's complement). There is also a publicly available software implementation of this standard.
Other representations are still supported by processors (IBM 390 and HP (was DEC) VAX) having an existing customer base that predates the publication of the documents on which this standard is based. These representations will probably continue to be supported for some time because of the existing code that relies on them (the IBM 390 and HP (was DEC) Alpha support both their companies' respective older representations and the IEC 60559 requirements).
There is a common belief that once the IEC 60559 Standard has been specified all of its required functionality will be provided by conforming implementations. It is possible that a C program’s dependencies on IEC 60559 constructs, which can vary between implementations, will not be documented because of this common, incorrect belief (the person writing documentation is not always the person who is familiar with this standard).
Like the C Standard, the IEC 60559 Standard does not fully specify the behavior of every construct. It also provides optional behavior for some constructs, such as when underflow is raised, and has optional constructs that an implementation may or may not make use of, such as double standard. C99 does not always provide a method for finding out an implementation's behavior in these optional areas. For instance, there are no standard macros describing the various options for handling underflow.
What Every Computer Scientist Should Know About Floating-Point Arithmetic says:
Languages and Compilers
Ambiguity
Ideally, a language definition should define the semantics of the language precisely enough to prove statements about programs. While this is usually true for the integer part of a language, language definitions often have a large grey area when it comes to floating-point. Perhaps this is due to the fact that many language designers believe that nothing can be proven about floating-point, since it entails rounding error. If so, the previous sections have demonstrated the fallacy in this reasoning. This section discusses some common grey areas in language definitions, including suggestions about how to deal with them.
... Another ambiguity in most language definitions concerns what happens on overflow, underflow and other exceptions. The IEEE standard precisely specifies the behavior of exceptions, and so languages that use the standard as a model can avoid any ambiguity on this point.
... Another grey area concerns the interpretation of parentheses. Due to round-off errors, the associative laws of algebra do not necessarily hold for floating-point numbers... Whether or not the language standard specifies that parentheses must be honored, (x+y)+z can have a totally different answer than x+(y+z), as discussed above.
... rounding can be a problem. The IEEE standard defines rounding very precisely, and it depends on the current value of the rounding modes. This sometimes conflicts with the definition of implicit rounding in type conversions or the explicit round function in languages.
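To illustrate the parentheses point concretely (a small example, assuming binary64 with round-to-nearest):

#include <cstdio>

int main()
{
    double x = 1e300, y = -1e300, z = 1.0;
    std::printf("%g\n", (x + y) + z);  // 1: x + y is exactly 0
    std::printf("%g\n", x + (y + z));  // 0: 1.0 is far below the ulp of 1e300, so y + z rounds back to -1e300
}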
The language standards cannot possibly specify the results of floating-point operations because, for example, one can change the rounding mode at run time using std::fesetround.
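For example (a sketch; some compilers also require #pragma STDC FENV_ACCESS ON, or a build flag such as -frounding-math, before rounding-mode changes are reliably honored):

#include <cfenv>
#include <cstdio>

int main()
{
    double x = 1.0, y = 3.0;
    std::fesetround(FE_DOWNWARD);
    std::printf("%.17g\n", x / y);  // quotient rounded toward -infinity
    std::fesetround(FE_UPWARD);
    std::printf("%.17g\n", x / y);  // quotient rounded toward +infinity: a different result
}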
So the C and C++ languages have no choice but to map operations on floating-point types directly to hardware instructions and not interfere, which is exactly what they do. Hence, the languages neither copy the IEEE/IEC standard nor mandate it.