I have a question regarding the underlying representation of float (and precision) in Python:
>>> b = 1.4 + 2.3
>>> b
3.6999999999999997
>>> c = 3.7
>>> c
3.7000000000000002
>>> print b, c
3.7 3.7
>>> b == c
False
It seems the values of b and c are machine dependent; they are the numbers closest to the target values, but not exactly those values. I was surprised that we get the 'right' numbers with print, and someone told me that this is because print 'lies', while the interactive prompt chooses to tell us the truth, i.e. to show exactly what is stored.
And my questions are:
1. How do we 'lie'? E.g. suppose a function takes two values and returns whether they are the same: how can I make a best guess when the number of decimal places (the precision) is unknown, as with b and c above? Is there a well-defined algorithm for that? I was told that every language (C/C++) has this kind of issue whenever floating point calculation is involved, but how do they 'solve' it?
2. Why can't we just store the actual number instead of the closest representable number? Is it a limitation, or a trade-off for efficiency?
Many thanks, John
For the answer to your first question, take a look at the following (slightly condensed) code from Python's source:
#define PREC_REPR 17
#define PREC_STR 12

void PyFloat_AsString(char *buf, PyFloatObject *v) {
    format_float(buf, 100, v, PREC_STR);
}

void PyFloat_AsReprString(char *buf, PyFloatObject *v) {
    format_float(buf, 100, v, PREC_REPR);
}
So basically, repr(float) will return a string formatted with 17 digits of precision, and str(float) will return a string with 12 digits of precision. As you might have guessed, print uses str() and entering the variable name in the interpreter uses repr(). With only 12 digits of precision, it looks like you get the "correct" answer, but that is just because what you expect and the actual value are the same up to 12 digits.
Here is a quick example of the difference:
>>> str(.1234567890123)
'0.123456789012'
>>> repr(.1234567890123)
'0.12345678901230001'
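To connect this back to the original b and c: the PREC_STR and PREC_REPR values in the source above amount, roughly, to '%.12g' and '%.17g' formatting, which you can reproduce by hand. This is just an illustrative reconstruction, not code from the Python source:
>>> b = 1.4 + 2.3
>>> c = 3.7
>>> '%.12g' % b, '%.12g' % c    # 12 significant digits, like str()/print
('3.7', '3.7')
>>> '%.17g' % b, '%.17g' % c    # 17 significant digits, like repr()/the prompt
('3.6999999999999997', '3.7000000000000002')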
As for your second question, I suggest you read the following section of the Python tutorial: Floating Point Arithmetic: Issues and Limitations
It boils down to efficiency: storing your base-10 decimals in base-2 floating point takes less memory and gives quicker floating point operations than any other representation would, but you do need to deal with the imprecision.
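To see what "the closest number" actually is for a familiar value like 0.1, you can ask Python to spell out the stored binary fraction exactly (this assumes Python 2.7 or later, which added Decimal.from_float):
>>> from decimal import Decimal
>>> Decimal.from_float(0.1)    # the exact value of the double nearest to 0.1
Decimal('0.1000000000000000055511151231257827021181583404541015625')
>>> (0.1).as_integer_ratio()   # the same value as an exact fraction
(3602879701896396, 36028797018963968)
The denominator is a power of two (2**55), which is exactly the point: only fractions with power-of-two denominators can be stored exactly in binary floating point, and 1/10 is not one of them.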
As JBernardo pointed out in the comments, this behavior changed in Python 2.7 and above; the following quote from the tutorial linked above describes the difference (using 0.1 as an example):
In versions prior to Python 2.7 and Python 3.1, Python rounded this value to 17 significant digits, giving ‘0.10000000000000001’. In current versions, Python displays a value based on the shortest decimal fraction that rounds correctly back to the true binary value, resulting simply in ‘0.1’.
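A quick way to see the newer behavior (on Python 2.7+/3.1+) is that the short repr still round-trips to the identical binary value; the extra digits only show up when you ask for them explicitly:
>>> repr(0.1)                # shortest string that maps back to the same double
'0.1'
>>> format(0.1, '.17g')      # the full 17-digit view older versions showed
'0.10000000000000001'
>>> float(repr(0.1)) == 0.1  # the short form round-trips exactly
True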