
Underlying data structure for float in Python

I have a question about the underlying data structure (and precision) of float in Python:

>>> b = 1.4 + 2.3
>>> b
3.6999999999999997

>>> c = 3.7
>>> c
3.7000000000000002

>>> print b, c
3.7 3.7

>>> b == c
False

It seems the values of b and c are machine dependent: they are the numbers closest to the target values, but not exactly the target values. I was surprised that print gives the 'right' numbers. Someone told me that this is because print 'lies', while the interpreter chooses to tell us the truth, i.e. it shows exactly what is stored.

My questions are:

1. How do I 'lie'? For example, in a function that takes two values and returns whether they are the same, how can I make a best guess if the number of decimal places (the precision) is unknown, as with b and c above? Is there a well-defined algorithm for that? I was told that every language (C/C++) has this kind of issue when floating point calculation is involved, but how do they 'solve' it?

2. Why can't we just store the actual number instead of the closest number? Is it a limitation, or a trade-off for efficiency?

Many thanks, John

asked Jul 18 '11 23:07 by John




1 Answer

For the answer to your first question, take a look at the following (slightly condensed) code from Python's source:

#define PREC_REPR       17
#define PREC_STR        12

void PyFloat_AsString(char *buf, PyFloatObject *v) {
    format_float(buf, 100, v, PREC_STR);
}

void PyFloat_AsReprString(char *buf, PyFloatObject *v) {
    format_float(buf, 100, v, PREC_REPR);
}

So basically, repr(float) returns a string formatted with 17 significant digits, and str(float) returns a string with 12 significant digits. As you might have guessed, print uses str(), while entering the variable name in the interpreter uses repr(). With only 12 digits of precision it looks like you get the "correct" answer, but that is only because the value you expect and the value actually stored agree in their first 12 digits.

Here is a quick example of the difference:

>>> str(.1234567890123)
'0.123456789012'
>>> repr(.1234567890123)
'0.12345678901230001'
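
The two precisions can also be reproduced by hand with printf-style formatting. The following is a minimal sketch, assuming the old str() and repr() behave roughly like '%.12g' and '%.17g'. The helper names old_str and old_repr are made up for illustration and are not part of Python. It is written for Python 3, where print is a function:

```python
# Hypothetical helpers that mimic the pre-2.7 str()/repr() precisions.
# They are illustrative only; the real format_float() has extra details
# (for example, appending ".0" to whole numbers) that are skipped here.
PREC_STR = 12
PREC_REPR = 17


def old_str(x):
    """Format x with 12 significant digits, like the old str(float)."""
    return '%.*g' % (PREC_STR, x)


def old_repr(x):
    """Format x with 17 significant digits, like the old repr(float)."""
    return '%.*g' % (PREC_REPR, x)


b = 1.4 + 2.3
c = 3.7

print(old_str(b), old_str(c))    # 3.7 3.7
print(old_repr(b), old_repr(c))  # 3.6999999999999997 3.7000000000000002
print(b == c)                    # False
```

Both values round to 3.7 at 12 digits, which is why print shows them as equal even though the stored doubles differ.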

As for your second question, I suggest you read the following section of the Python tutorial: Floating Point Arithmetic: Issues and Limitations

It boils down to efficiency: storing numbers in base 2 takes less memory and gives quicker floating point operations than other representations. The price is that many base 10 decimals cannot be represented exactly in base 2, so you do need to deal with the imprecision.
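
For the comparison asked about in the first question, one common way to deal with the imprecision is to compare with a tolerance instead of using ==. This is not taken from the Python source shown above. It is a sketch that assumes Python 3.5 or later, where math.isclose is available. The wrapper name nearly_equal and its default tolerances are illustrative choices, not a standard:

```python
import math


def nearly_equal(x, y, rel_tol=1e-9, abs_tol=0.0):
    """Return True if x and y are equal within a tolerance.

    Hypothetical wrapper around math.isclose. rel_tol is relative to the
    larger magnitude of the two inputs; abs_tol is needed when comparing
    against values at or near zero.
    """
    return math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)


b = 1.4 + 2.3
c = 3.7

print(b == c)                                  # False
print(nearly_equal(b, c))                      # True
print(nearly_equal(0.1 + 0.2, 0.3))            # True
print(nearly_equal(1e-12, 0.0))                # False: relative tolerance alone
print(nearly_equal(1e-12, 0.0, abs_tol=1e-9))  # True
```

There is no single correct tolerance. It depends on the magnitude of the values and on how much error the preceding calculations may have accumulated, so the caller has to choose it.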

As JBernardo pointed out in the comments, this behavior is different in Python 2.7 and above. The following quote from the tutorial linked above describes the difference (using 0.1 as an example):

In versions prior to Python 2.7 and Python 3.1, Python rounded this value to 17 significant digits, giving ‘0.10000000000000001’. In current versions, Python displays a value based on the shortest decimal fraction that rounds correctly back to the true binary value, resulting simply in ‘0.1’.
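
The difference described in the quote can be checked directly. This is a small sketch, assuming Python 3.1 or later and IEEE 754 doubles (which is what CPython uses on common platforms). It uses decimal.Decimal only to display the exact value stored in the float:

```python
from decimal import Decimal

x = 0.1

# Decimal(float) converts the binary double exactly, digit for digit.
exact = Decimal(x)
print(exact)  # 0.1000000000000000055511151231257827021181583404541015625

# The old repr() style: 17 significant digits.
print('%.17g' % x)  # 0.10000000000000001

# The current repr(): the shortest string that round-trips to the same double.
print(repr(x))  # 0.1

# Both strings convert back to exactly the same stored value.
print(float(repr(x)) == x)       # True
print(float('%.17g' % x) == x)   # True
```

So the shorter output does not mean the stored value changed. Only the choice of which decimal string to display is different.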

answered Nov 09 '22 23:11 by Andrew Clark