My question is whether all integer values are guaranteed to have a perfect double representation.
Consider the following code sample that prints "Same":
```cpp
// Example program
#include <iostream>
#include <string>

int main()
{
    int a = 3;
    int b = 4;
    double d_a(a);
    double d_b(b);

    double int_sum = a + b;
    double d_sum = d_a + d_b;

    if (double(int_sum) == d_sum)
    {
        std::cout << "Same" << std::endl;
    }
}
```
Is this guaranteed to be true for any architecture, any compiler, any values of `a` and `b`? Will any integer `i`, converted to `double`, always be represented as `i.0000000000000` and not, for example, as `i.000000000001`?
I tried it for some other numbers and it always was true, but was unable to find anything about whether this is coincidence or by design.
Note: This is different from this question (aside from the language) since I am adding the two integers.
A double simply has more bits available than a float (hence the name "double precision"), so any number that can be represented as a float can also be represented as a double. The extra significand bits are just filled with zeros.
Many compilers use the IEEE 754 standard to store floating-point values and perform arithmetic on them. The code is safe on a platform where int is 32 bits wide, because an IEEE 754 double has 53 significant bits (52 stored bits plus an implicit leading 1) and can therefore store any 32-bit integer value without loss.
The double is a fundamental data type built into the compiler and used to define numeric variables holding numbers with decimal points. C, C++, C# and many other programming languages recognize the double as a type. A double type can represent fractional as well as whole values.
Real numbers are represented in C by the floating point types float, double, and long double. Just as the integer types can't represent all integers because they fit in a bounded number of bytes, so also the floating-point types can't represent all real numbers.
Disclaimer (as suggested by Toby Speight): Although IEEE 754 representations are quite common, an implementation is permitted to use any other representation that satisfies the requirements of the language.
Doubles are represented in the form mantissa * 2^exponent, i.e. some of the bits are used for the non-integer part of the double number.
type         bits  range                   precision
float        32    1.5E-45 .. 3.4E38       7-8 digits
double       64    5.0E-324 .. 1.7E308     15-16 digits
long double  80    1.9E-4951 .. 1.1E4932   19-20 digits
The fraction part can also be used to represent an integer, by using an exponent that removes all the digits after the point.
E.g. 2.9979 · 10^4 = 29979.
Since a common `int` is usually 32 bit you can represent all `int`s as double, but for 64 bit integers of course this is no longer true. To be more precise (as LThode noted in a comment): IEEE 754 double-precision can guarantee this for up to 53 bits (52 bits of significand + the implicit leading 1 bit).
Answer: yes for 32 bit ints, no for 64 bit ints.
(This is correct for server/desktop general-purpose CPU environments, but other architectures may behave differently.)
Practical Answer as Malcom McLean puts it: 64 bit doubles are an adequate integer type for almost all integers that are likely to count things in real life.
For the empirically inclined, try this:
```cpp
#include <iostream>
#include <limits>
using namespace std;

int main() {
    double test;
    volatile int test_int;
    for (int i = 0; i < std::numeric_limits<int>::max(); i++) {
        test = i;
        test_int = test;

        // compare int with int:
        if (test_int != i)
            std::cout << "found integer i=" << i << ", test=" << test << std::endl;
    }
    return 0;
}
```
Success time: 0.85 memory: 15240 signal:0
Subquestion: Regarding the question about fractional differences: is it possible to have an integer which converts to a double that is off from the correct value by just a fraction, but which converts back to the same integer due to rounding?
The answer is no, because any integer which converts back and forth to the same value is actually represented as that exact integer value in the double. For me the simplest explanation (suggested by ilkkachu) is that, with the factor 2^exponent, the step width between neighbouring doubles must always be a power of two. Up to 2^53 the step width is at most 1, so every integer is hit exactly. Beyond the largest 53-bit integer, there are never two double values with a distance smaller than 2, which solves the rounding issue.
No. Suppose you have a 64-bit integer type and a 64-bit floating-point type (which is typical for a `double`). There are 2^64 possible values for that integer type and there are 2^64 possible values for that floating-point type. But some of those floating-point values (in fact, most of them) do not represent integer values, so the floating-point type can represent fewer integer values than the integer type can.