
Double precision is different in different languages

I'm experimenting with the precision of a double value in various programming languages.

My programs

main.c

#include <stdio.h>

int main() {
    for (double i = 0.0; i < 3; i = i + 0.1) {
        printf("%.17lf\n", i);
    }
    return 0;
}

main.cpp

#include <iostream>

using namespace std;

int main() {
    cout.precision(17);
    for (double i = 0.0; i < 3; i = i + 0.1) {
        cout << fixed << i << endl;
    }
    return 0;
}

main.py

i = 0.0
while i < 3:
    print(i)
    i = i + 0.1

Main.java

public class Main {
    public static void main(String[] args) {
        for (double i = 0.0; i < 3; i = i + 0.1) {
            System.out.println(i);
        }
    }
}

The output

main.c

0.00000000000000000
0.10000000000000001
0.20000000000000001
0.30000000000000004
0.40000000000000002
0.50000000000000000
0.59999999999999998
0.69999999999999996
0.79999999999999993
0.89999999999999991
0.99999999999999989
1.09999999999999990
1.20000000000000000
1.30000000000000000
1.40000000000000010
1.50000000000000020
1.60000000000000030
1.70000000000000040
1.80000000000000050
1.90000000000000060
2.00000000000000040
2.10000000000000050
2.20000000000000060
2.30000000000000070
2.40000000000000080
2.50000000000000090
2.60000000000000100
2.70000000000000110
2.80000000000000120
2.90000000000000120

main.cpp

0.00000000000000000
0.10000000000000001
0.20000000000000001
0.30000000000000004
0.40000000000000002
0.50000000000000000
0.59999999999999998
0.69999999999999996
0.79999999999999993
0.89999999999999991
0.99999999999999989
1.09999999999999987
1.19999999999999996
1.30000000000000004
1.40000000000000013
1.50000000000000022
1.60000000000000031
1.70000000000000040
1.80000000000000049
1.90000000000000058
2.00000000000000044
2.10000000000000053
2.20000000000000062
2.30000000000000071
2.40000000000000080
2.50000000000000089
2.60000000000000098
2.70000000000000107
2.80000000000000115
2.90000000000000124

main.py

0.0
0.1
0.2
0.30000000000000004
0.4
0.5
0.6
0.7
0.7999999999999999
0.8999999999999999
0.9999999999999999
1.0999999999999999
1.2
1.3
1.4000000000000001
1.5000000000000002
1.6000000000000003
1.7000000000000004
1.8000000000000005
1.9000000000000006
2.0000000000000004
2.1000000000000005
2.2000000000000006
2.3000000000000007
2.400000000000001
2.500000000000001
2.600000000000001
2.700000000000001
2.800000000000001
2.9000000000000012

Main.java

0.0
0.1
0.2
0.30000000000000004
0.4
0.5
0.6
0.7
0.7999999999999999
0.8999999999999999
0.9999999999999999
1.0999999999999999
1.2
1.3
1.4000000000000001
1.5000000000000002
1.6000000000000003
1.7000000000000004
1.8000000000000005
1.9000000000000006
2.0000000000000004
2.1000000000000005
2.2000000000000006
2.3000000000000007
2.400000000000001
2.500000000000001
2.600000000000001
2.700000000000001
2.800000000000001
2.9000000000000012

My question

I know that the double type itself has inherent rounding errors, which we can learn more about from articles like Why You Should Never Use Float and Double for Monetary Calculations and What Every Computer Scientist Should Know About Floating-Point Arithmetic.

But these errors are not random! The errors are the same on every run, so my question is: why do they differ between programming languages?

Secondly, why are the precision errors in Java and Python the same? [Java's JVM is written in C++, whereas the Python interpreter is written in C.]

Surprisingly, their errors are the same as each other, yet different from the errors in C and C++. Why is this happening?

asked Jan 15 '21 by Jaysmito Mukherjee



4 Answers

The differences in output are due to differences in converting the floating-point number to a numeral. (By numeral, I mean a character string or other text that represents a number. “20”, “20.0”, “2e+1”, and “2•10¹” are different numerals for the same number.)

For reference, I show the exact values of i in notes below.

In C, the %.17lf conversion specification you use requests 17 digits after the decimal point, so 17 digits after the decimal point are produced. However, the C standard allows some slack here: it only requires calculation of enough digits that the actual internal value can be distinguished [1]. The rest can be filled in with zeros (or other “incorrect” digits). It appears the C standard library you are using only fully calculates 17 significant digits and fills the rest you request with zeros. This explains why you got “2.90000000000000120” instead of “2.90000000000000124”. (Note that “2.90000000000000120” has 18 digits: 1 before the decimal point, 16 significant digits after it, and 1 non-significant “0”. “0.10000000000000001” has an aesthetic “0” before the decimal point and 17 significant digits after it. The requirement for 17 significant digits is why “0.10000000000000001” must have the “1” at the end but “2.90000000000000120” may have a “0”.)

In contrast, it appears your C++ standard library does the full calculation, or at least computes more digits (which may be due to a rule in the C++ standard [2]), so you get “2.90000000000000124”.
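You can check this yourself with a minimal Python sketch (assuming CPython on a platform with IEEE-754 binary64 doubles, which is typical): decimal.Decimal of a float is exact, and CPython's own float formatting is correctly rounded, so it reproduces the C++ result rather than the zero-filled C one.

from decimal import Decimal

i = 2.9000000000000012    # the double from the last loop iteration (Python's own output above)
print(Decimal(i))         # exact stored value:
                          # 2.90000000000000124344978758017532527446746826171875
print(f"{i:.17f}")        # 2.90000000000000124 -- correctly rounded to 17 places, matching the C++ output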

Python 3.1 added an algorithm to convert with the same result as Java (see below). Prior to that, it was lax about the conversion for display. (To my knowledge, it is still lax about the floating-point format used and about conformance to IEEE-754 in arithmetic operations; specific Python implementations may differ in behavior.)

Java requires that the default conversion from double to string produce just as many digits as are required to distinguish the number from neighboring double values. So it produces “0.2” instead of “0.20000000000000001” because the double nearest 0.2 is the value that i had in that iteration. In contrast, in the next iteration, the rounding errors in arithmetic gave i a value slightly different from the double nearest 0.3, so Java produced “0.30000000000000004” for it. In the next iteration, the new rounding error happened to partially cancel the accumulated error, so it was back to “0.4”.
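The effect of this "shortest digits that still round-trip" rule is easy to see from Python, which uses the same kind of conversion since 3.1 as noted above. This is only a sketch of the effect, not of the algorithm either runtime actually implements:

x = 0.1 + 0.1              # happens to equal the double nearest 0.2
y = 0.1 + 0.1 + 0.1        # does NOT equal the double nearest 0.3

print(x)                   # 0.2
print(y)                   # 0.30000000000000004
print(float("0.2") == x)   # True  -- "0.2" already identifies this double exactly
print(float("0.3") == y)   # False -- so more digits are needed to identify y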

Notes

The exact values of i when IEEE-754 binary64 is used are:

0
0.1000000000000000055511151231257827021181583404541015625
0.200000000000000011102230246251565404236316680908203125
0.3000000000000000444089209850062616169452667236328125
0.40000000000000002220446049250313080847263336181640625
0.5
0.59999999999999997779553950749686919152736663818359375
0.6999999999999999555910790149937383830547332763671875
0.79999999999999993338661852249060757458209991455078125
0.899999999999999911182158029987476766109466552734375
0.99999999999999988897769753748434595763683319091796875
1.0999999999999998667732370449812151491641998291015625
1.1999999999999999555910790149937383830547332763671875
1.3000000000000000444089209850062616169452667236328125
1.4000000000000001332267629550187848508358001708984375
1.5000000000000002220446049250313080847263336181640625
1.6000000000000003108624468950438313186168670654296875
1.7000000000000003996802888650563545525074005126953125
1.8000000000000004884981308350688777863979339599609375
1.9000000000000005773159728050814010202884674072265625
2.000000000000000444089209850062616169452667236328125
2.10000000000000053290705182007513940334320068359375
2.200000000000000621724893790087662637233734130859375
2.300000000000000710542735760100185871124267578125
2.400000000000000799360577730112709105014801025390625
2.50000000000000088817841970012523233890533447265625
2.600000000000000976996261670137755572795867919921875
2.7000000000000010658141036401502788066864013671875
2.800000000000001154631945610162802040576934814453125
2.90000000000000124344978758017532527446746826171875

These are not all the same values you would get by converting 0, .1, .2, .3,… 2.9 from decimal to binary64 because they are produced by arithmetic, so there are multiple rounding errors from the initial conversions and the consecutive additions.
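If you want to reproduce this list, one way (assuming CPython, where decimal.Decimal of a float is exact) is to run the same loop as main.py and print the exact stored value at each step:

from decimal import Decimal

i = 0.0
while i < 3:
    print(Decimal(i))   # exact decimal expansion of the binary64 value of i
    i = i + 0.1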

Footnotes

[1] C 2018 7.21.6.1 only requires that the resulting numeral be accurate to DECIMAL_DIG digits in a specified sense. DECIMAL_DIG is the number of digits such that, for any number in any floating-point format in the implementation, converting it to a decimal number with DECIMAL_DIG significant digits and then back to floating-point yields the original value. If IEEE-754 binary64 is the most precise format your implementation supports, then its DECIMAL_DIG is at least 17.

[2] I do not see such a rule in the C++ standard, other than incorporation of the C standard, so it may be that your C++ library is simply using a different method from your C library as a matter of choice.

answered by Eric Postpischil


The differences you're seeing are in how you print out the data, not in the data itself.

As I see it, we have two problems here. One is that you're not consistently specifying the same precision when you print out the data in each language.

The second is that you're printing the data out to 17 digits of precision, but at least as normally implemented (double being a 64-bit number with a 53-bit significand) a double really only has about 15 decimal digits of precision.

So, while (for example) C and C++ both require that your result be rounded "correctly", once you go beyond the precision the type is supposed to support, they can't guarantee much about producing truly identical results in every possible case.

But that's going to affect only how the result looks when you print it out, not how it's actually stored internally.
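A small Python illustration of that point (assuming IEEE-754 binary64 doubles): one stored value, several different-looking printed results.

x = 0.1 + 0.1 + 0.1

print(x)             # 0.30000000000000004     (shortest string that round-trips)
print(f"{x:.15f}")   # 0.300000000000000       (looks clean at 15 places)
print(f"{x:.17f}")   # 0.30000000000000004
print(f"{x:.20f}")   # 0.30000000000000004441  (digits beyond what the type can distinguish)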

answered by Jerry Coffin


I don't know about Python or Java, but neither C nor C++ insists that the printed decimal representation of a double value be as precise or concise as possible. So comparing printed decimal representations does not tell you everything about the actual value that is being printed. Two values could be the same in the binary representation but still legitimately print as different decimal strings in different languages (or different implementations of the same language).

Therefore your lists of printed values are not telling you that anything unusual is going on.

What you should do instead is print the exact binary representations of your double values.
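For example, in Python you could use float.hex() or struct to see the underlying representation (C has the %a printf conversion for the same purpose); a minimal sketch:

import struct

x = 0.1 + 0.2

print(x.hex())                     # 0x1.3333333333334p-2 -- exact hexadecimal floating point
print(struct.pack(">d", x).hex())  # 3fd3333333333334     -- the raw 64-bit pattern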

Some useful reading: https://www.exploringbinary.com/

answered by john


But these errors are not random!

Correct. That should be expected.

why are these different for different programming languages?

Because you've formatted the output differently.

Why are the errors in Java and Python the same?

They seem to have the same, or sufficiently similar, default formatting.
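For instance, if you format Python's values the way the C/C++ programs do, the short Java/Python-style output turns into the long C-style output (and, since CPython's conversion is correctly rounded, it matches the C++ listing above rather than the zero-filled C one); a sketch:

i = 0.0
while i < 3:
    print(f"{i:.17f}")   # same doubles as before, just printed with 17 digits after the point
    i = i + 0.1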

answered by eerorika