
What Are the Maximum Number of Base-10 Digits in the Fractional Part of a Floating Point Number

If a floating point number could be output with no truncation of its value (say with setprecision) and in fixed notation (say with fixed), what buffer size would be required to guarantee that the entire fractional part of the floating point number could be stored in the buffer?

I'm hoping there is something in the standard, like a #define or something in numeric_limits, that would tell me the maximum number of base-10 places in the fractional part of a floating point type.

I asked about the maximum number of base-10 digits in the integral part of a floating point type here: What Are the Maximum Number of Base-10 Digits in the Integral Part of a Floating Point Number

But I realize the fractional part may be more complex. For example, 1.0 / 3.0 is an infinitely repeating decimal. When I output that using fixed formatting I get this many places before repeating 0s:

0.333333333333333314829616256247390992939472198486328125

But I can't necessarily say that's the maximum precision, because I don't know how many of those trailing 0s were actually represented in the floating point's fraction, and it hasn't been shifted down by a negative exponent.
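For reference, output like the above can be produced with something along these lines. This is only a minimal sketch: the helper name to_fixed is made up for illustration, and how many digits are printed exactly (rather than rounded or zero-padded early) depends on the standard library implementation.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format `value` in fixed notation with `places`
// digits after the decimal point and return the text.
std::string to_fixed(double value, int places) {
    std::ostringstream out;
    out << std::fixed << std::setprecision(places) << value;
    return out.str();
}
```

Calling to_fixed(1.0 / 3.0, 60) returns "0." followed by 60 digits; on an implementation that prints exact digits, the tail of that string is the run of 0s described above.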

I know we have min_exponent10. Is that what I should be looking at for this?

Jonathan Mee asked Oct 03 '16 15:10



2 Answers

If you consider the 32 and 64 bit IEEE 754 numbers, the maximum number of fractional digits can be calculated as described below.

It is all about negative powers of 2. So let's see how each exponent contributes:

2^-1 = 0.5         i.e. 1 digit
2^-2 = 0.25        i.e. 2 digits
2^-3 = 0.125       i.e. 3 digits
2^-4 = 0.0625      i.e. 4 digits
....
2^-N = 0.0000..    i.e. N digits

Since the base-10 expansion of each of these always ends in 5, you can see that the number of base-10 digits increases by 1 when the exponent decreases by 1. So 2^(-N) will require N digits.
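The claim that 2^-N needs exactly N digits can be checked with exact decimal arithmetic. The following is a minimal sketch, not part of the original answer: the function name pow2_neg_fraction is hypothetical, and it relies on the identity 2^-N = 5^N / 10^N, so the fractional digits are just 5^N left-padded with zeros to N places.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the exact digits after the decimal point
// of 2^-n for n >= 1, using 2^-n == 5^n / 10^n.
std::string pow2_neg_fraction(int n) {
    std::vector<int> digits{1};  // little-endian decimal digits, starts as 5^0
    for (int i = 0; i < n; ++i) {
        int carry = 0;
        for (int& d : digits) {  // multiply the big integer by 5
            int v = d * 5 + carry;
            d = v % 10;
            carry = v / 10;
        }
        if (carry != 0) digits.push_back(carry);  // carry is at most 4
    }
    std::string result;
    // 5^n has at most n digits; pad on the left so the total is n places.
    for (int pad = n - static_cast<int>(digits.size()); pad > 0; --pad) {
        result.push_back('0');
    }
    for (auto it = digits.rbegin(); it != digits.rend(); ++it) {
        result.push_back(static_cast<char>('0' + *it));
    }
    return result;
}
```

For example, pow2_neg_fraction(4) gives "0625", matching 2^-4 = 0.0625 in the list above. The result always has length n and ends in 5, because 5^n ends in 5.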

Also notice that when adding those contributions, the number of resulting digits is determined by the smallest contribution, i.e. the term with the most negative exponent. So what you need to find out is the smallest exponent that can contribute.

For 32 bit IEEE 754 you have:

Smallest exponent -126

Fraction bits 23

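So the smallest exponent of any bit is -126 + -23 = -149 (the lowest fraction bit at the smallest exponent, which is also the value of the smallest subnormal), so the smallest contribution will come from 2^-149, i.e.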

For 32 bit IEEE 754 printed in base-10 there can be 149 fractional digits

For 64 bit IEEE 754 you have:

Smallest exponent -1022

Fraction bits 52

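So the smallest exponent of any bit is -1022 + -52 = -1074 (the lowest fraction bit at the smallest exponent, which is also the value of the smallest subnormal), so the smallest contribution will come from 2^-1074, i.e.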

For 64 bit IEEE 754 printed in base-10 there can be 1074 fractional digits
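The two totals can be cross-checked against std::numeric_limits. This is a minimal sketch assuming a binary (radix 2) type; the helper name max_fraction_digits is made up for illustration. It uses the fact that min_exponent is one more than the smallest normal exponent and digits is one more than the number of stored fraction bits.

```cpp
#include <limits>

// Hypothetical helper: upper bound on the number of base-10 digits after
// the decimal point for a binary floating point type T.
template <typename T>
constexpr int max_fraction_digits() {
    using lim = std::numeric_limits<T>;
    static_assert(lim::radix == 2, "sketch only covers binary floating point");
    // Smallest normal exponent: min_exponent - 1 (-126 for 32 bit IEEE 754).
    // Stored fraction bits:     digits - 1       (23 for 32 bit IEEE 754).
    // Lowest bit position is their difference; negate it to count digits.
    return -((lim::min_exponent - 1) - (lim::digits - 1));
}
```

On an IEEE 754 platform, max_fraction_digits<float>() evaluates to 149 and max_fraction_digits<double>() to 1074. A buffer holding only the fractional digits would need that many characters, plus one more if it is a null-terminated C string.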

Support Ukraine answered Sep 29 '22 19:09


I'm reasonably certain the standard doesn't (and can't, without imposing other restrictions) provide a pre-defined constant to specify the number you're asking for.

Floating point is most often represented in base 2, but base 16 and base 10 are also in reasonably wide use.

In all of these cases, the only prime factors of the base (2 and possibly 5) are also prime factors of 10. As a result, we never get an infinitely repeating number when converting from them to base 10 (decimal).

The standards don't restrict floating point to such representations though. In theory, if somebody really wanted to they could use (for example) base 3 or base 7 for their floating point representation. If they did so, it would be trivial to store a number that would repeat indefinitely when converted to decimal. For example 0.1 in base 3 would represent 1/3, which repeats infinitely when converted to base 10. Although I've never heard of anybody doing it, I believe such an implementation could meet the requirements of the standard.

For a typical binary representation, min_exponent should probably be a reasonable proxy for the value you want. Unfortunately, it's probably not possible to state things much more precisely than that though.

For example, an implementation is allowed to store intermediate values to greater precision than it stores in memory, so it's possible that (for example) if you give 1.0/3.0 literally in your source code, the result could actually differ from the value produced by reading a pair of inputs at run time, entering 1 and 3, and dividing them. In the former case, the division might be carried out at compile time, so the result you printed out would be exactly the size of a double, with no extra. When you enter the two values at run time, the division would be carried out at run time, and you might get a result with higher precision.

The standard does also require that the base of the floating point be documented as std::numeric_limits<T>::radix. Based on this, you could compute an approximation of the maximum number of places after the decimal point based on radix^min_exponent, as long as the prime factors of the radix were shared with the prime factors of 10.
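One way to sketch that computation is shown below (C++14 or later, because of the loops in a constexpr function). Both helper names are hypothetical. Note one assumption that goes beyond the text above: the sketch uses radix^(min_exponent - digits) rather than radix^min_exponent, so that the lowest stored digit of the smallest normal value is counted too. A radix of the form 2^a * 5^b needs max(a, b) decimal places per radix digit; any other prime factor means the decimal expansion can repeat forever, which the sketch reports as 0.

```cpp
#include <limits>

// Hypothetical helper: decimal places needed per radix digit after the
// point, or 0 if the radix has a prime factor other than 2 or 5.
constexpr int decimal_places_per_radix_digit(int radix) {
    if (radix < 2) return 0;
    int twos = 0;
    int fives = 0;
    while (radix % 2 == 0) { radix /= 2; ++twos; }
    while (radix % 5 == 0) { radix /= 5; ++fives; }
    if (radix != 1) return 0;  // e.g. base 3 or base 7
    return twos > fives ? twos : fives;
}

// Hypothetical helper: approximate maximum number of places after the
// decimal point for T, or 0 if no finite bound exists for its radix.
template <typename T>
constexpr int approx_max_fraction_places() {
    using lim = std::numeric_limits<T>;
    // Assumed lowest digit position: radix^(min_exponent - digits).
    return (lim::digits - lim::min_exponent) *
           decimal_places_per_radix_digit(lim::radix);
}
```

For a radix 2 double with IEEE 754 parameters this gives (53 + 1021) * 1 = 1074. For radix 16 the per-digit factor is 4, and for radix 3 or 7 the function returns 0 to signal that no finite buffer is guaranteed to suffice.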

Jerry Coffin answered Sep 29 '22 20:09