Could someone explain why I get two different numbers, 14 and 15 respectively, as output from the following code?
#include <stdio.h>

int main() {
    double Vmax = 2.9;
    double Vmin = 1.4;
    double step = 0.1;

    double a = (Vmax-Vmin)/step;
    int b = (Vmax-Vmin)/step;
    int c = a;

    printf("%d %d",b,c); // 14 15, why?
    return 0;
}
I expect to get 15 in both cases but it seems I'm missing some fundamentals of the language.
I am not sure if it's relevant, but I was running the test in CodeBlocks. However, if I type the same lines of code into an online compiler (this one, for example), I get 15 for both printed variables.
Double precision is an inexact, variable-precision numeric type. In other words, some values cannot be represented exactly and are stored as approximations. Thus, input and output operations involving double precision might show slight discrepancies.
Double precision provides greater range (approximately 10**(-308) to 10**308) and precision (about 15 decimal digits) than single precision (approximate range 10**(-38) to 10**38, with about 7 decimal digits of precision).
Double precision means a number takes twice as much storage as a single-precision one. In the common IEEE 754 formats, a float is 32 bits and a double is 64 bits, whatever the processor's word size.
A double typically offers about 15 significant decimal digits, roughly twice the precision of the float data type. A literal written with a decimal point, such as 3.0, has type double.
... why I get two different numbers ...
Aside from the usual floating-point issues, b and c are computed along different paths: c is calculated by first saving the value as double a.
double a = (Vmax-Vmin)/step;
int b = (Vmax-Vmin)/step;
int c = a;
C allows intermediate floating-point math to be computed using wider types. Check the value of FLT_EVAL_METHOD from <float.h>.
Except for assignment and cast (which remove all extra range and precision), ...
-1 indeterminable;
0 evaluate all operations and constants just to the range and precision of the type;
1 evaluate operations and constants of type float and double to the range and precision of the double type, evaluate long double operations and constants to the range and precision of the long double type;
2 evaluate all operations and constants to the range and precision of the long double type.
C11dr §5.2.4.2.2 9
OP reported 2
By saving the quotient in double a = (Vmax-Vmin)/step;, precision is forced to double, whereas int b = (Vmax-Vmin)/step; could compute as long double.
This subtle difference results from (Vmax-Vmin)/step (computed perhaps as long double) being saved as a double versus remaining a long double. One is 15 (or just above), and the other is just under 15. int truncation amplifies this difference to 15 and 14.
On another compiler, the results may both have been the same due to FLT_EVAL_METHOD < 2 or other floating-point characteristics.
Conversion to int from a floating-point number is severe with numbers near a whole number. It is often better to use round() or lround(). The best solution is situation dependent.