Yeah, I meant to say 80-bit. That's not a typo...
My experience with floating point variables has always involved 4-byte multiples, like singles (32 bit), doubles (64 bit), and long doubles (which I've seen referred to as either 96-bit or 128-bit). That's why I was a bit confused when I came across an 80-bit extended precision data type while I was working on some code to read and write to AIFF (Audio Interchange File Format) files: an extended precision variable was chosen to store the sampling rate of the audio track.
When I skimmed through Wikipedia, I found the link above along with a brief mention of 80-bit formats in the IEEE 754-1985 standard summary (but not in the IEEE 754-2008 standard summary). It appears that on certain architectures "extended" and "long double" are synonymous.
One thing I haven't come across is any specific application that makes use of extended precision data types (except, of course, for AIFF file sampling rates). This led me to wonder:
Extended precision refers to floating-point number formats that provide greater precision than the basic floating-point formats. Extended precision formats support a basic format by minimizing roundoff and overflow errors in intermediate values of expressions on the base format.
The x87 extended precision format requires 80 bits: 1 sign bit, a 15-bit exponent, and a 64-bit significand.
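To make the 80-bit layout concrete, here is a rough sketch of decoding a 10-byte big-endian extended precision value, such as the sampling rate field in an AIFF file, into an ordinary Python float. The function name `decode_extended80` is made up for illustration. It assumes the x87 layout (1 sign bit, 15-bit exponent with bias 16383, 64-bit significand with an explicit integer bit). The result is rounded to a 64-bit double, so some precision can be lost, and values outside the double range are not handled.

```python
import math
import struct


def decode_extended80(data: bytes) -> float:
    """Decode a 10-byte big-endian 80-bit extended value into a Python float.

    Hypothetical helper for illustration; the result is rounded to a double.
    """
    if len(data) != 10:
        raise ValueError("expected exactly 10 bytes")

    # First 2 bytes: sign bit + 15-bit exponent. Last 8 bytes: 64-bit significand.
    sign_exp, significand = struct.unpack(">HQ", data)
    sign = -1.0 if sign_exp & 0x8000 else 1.0
    exponent = sign_exp & 0x7FFF

    if exponent == 0x7FFF:
        # All-ones exponent: infinity if the fraction bits are zero, else NaN.
        fraction = significand & 0x7FFFFFFFFFFFFFFF
        return sign * math.inf if fraction == 0 else math.nan

    if exponent == 0:
        # Zero and denormals use the minimum exponent.
        exponent = 1

    # The significand is an integer whose top bit has weight 2**0,
    # so scale it by 2**(exponent - bias - 63).
    return sign * math.ldexp(float(significand), exponent - 16383 - 63)


# 44100 Hz: exponent 0x400E, significand 0xAC44 followed by zero bytes.
rate = decode_extended80(bytes.fromhex("400EAC44000000000000"))
print(rate)
```

For common audio rates such as 44100 or 48000 the significand has only a few nonzero bits, so the conversion to a double is exact.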
Floating-point representation describes a binary number in exponential form. It breaks the number into two parts: a signed, fixed-point number known as the mantissa, and an exponent that scales it.
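The mantissa/exponent split can be seen directly with Python's `math.frexp`, which is only a quick illustration and not how the hardware stores the value. Note that `frexp` normalizes the mantissa to the range [0.5, 1), whereas IEEE formats normalize it to [1, 2), so the exponent it reports is one higher than the IEEE one.

```python
import math

x = 44100.0

# frexp returns (mantissa, exponent) such that x == mantissa * 2**exponent
mantissa, exponent = math.frexp(x)
print(mantissa, exponent)

# ldexp is the inverse: it rebuilds the value from the two parts
rebuilt = math.ldexp(mantissa, exponent)
print(rebuilt == x)
```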
Floating-point decimal values generally do not have an exact binary representation. This is a side effect of how the CPU represents floating point data. For this reason, you may experience some loss of precision, and some floating-point operations may produce unexpected results.
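A small sketch of that effect in Python, whose `float` is a 64-bit double on typical platforms: the decimal literals 0.1 and 0.2 are stored as the nearest binary values, so their sum is not exactly the value stored for 0.3.

```python
import math
from decimal import Decimal

total = 0.1 + 0.2

# Direct comparison fails because none of the three literals is exact in binary
print(total == 0.3)

# Decimal(float) shows the exact value that was actually stored for 0.1
print(Decimal(0.1))

# Comparing with a tolerance is the usual workaround
print(math.isclose(total, 0.3))
```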
Intel's FPUs use the 80-bit format internally to get more precision for intermediate results.
That is, you may have 32-bit or 64-bit variables, but when they are loaded into the FPU registers, they are converted to 80 bits; the FPU then (by default) performs all calculations in 80 bits; after the calculation, the result is stored back into a 32-bit or 64-bit variable.
BTW - A somewhat unfortunate consequence of this is that debug and release builds may produce slightly different results: in the release build, the optimizer may keep an intermediate variable in an 80-bit FPU register, while in the debug build it is stored in a 64-bit variable, causing loss of precision. You can avoid this by using 80-bit variables, or by using an FPU switch (or compiler option) to perform all calculations in 64 bit.
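Python does not expose the x87 registers, so the following is only an analogy, not a reproduction of the 80-bit case: it uses 64-bit doubles as the "wide" format and 32-bit singles as the "narrow" one to show how storing an intermediate result in a narrower variable can change the final answer. The helper `round_to_single` is a made-up name for this sketch.

```python
import struct


def round_to_single(value: float) -> float:
    """Round a 64-bit double to the nearest 32-bit single and widen it again."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


x = 16777216.0  # 2**24, exactly representable in single precision
y = 1.0

# "Release-like": the intermediate x + y stays in the wider format
wide = (x + y) - x

# "Debug-like": the intermediate is stored into a narrower variable first,
# and 2**24 + 1 does not fit in a single's 24-bit significand
narrow = round_to_single(x + y) - x

print(wide, narrow)
```

The same expression gives 1.0 when the intermediate keeps its extra bits and 0.0 when it is rounded on the way, which is the kind of discrepancy described above, just one precision level down.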