I have a list of floats (actually it's a pandas Series object, if it changes anything) which looks like this:
mySeries:
...
22 16.0
23 14.0
24 12.0
25 10.0
26 3.1
...
(So elements of this Series are on the right, indices on the left.) Then I'm trying to assign the elements from this Series as keys in a dictionary, and indices as values, like this:
{ mySeries[i]: i for i in mySeries.index }
and I'm getting pretty much what I wanted, except that...
{ 6400.0: 0, 66.0: 13, 3.1000000000000001: 23, 133.0: 10, ... }
Why has 3.1 suddenly changed into 3.1000000000000001? I guess this has something to do with the way floating-point numbers are represented, but why does it happen now and how do I avoid/fix it?
EDIT: Please feel free to suggest a better title for this question if it's inaccurate.
EDIT2: Ok, so it seems that it's the exact same number, just printed differently. Still, if I assign mySeries[26] as a dictionary key and then try to run:
myDict[mySeries[26]]
I get a KeyError. What's the best way to avoid it?
A dictionary key must be of an immutable (hashable) type: an integer, float, string, or Boolean works fine, but neither a list nor another dictionary can serve as a key, because lists and dictionaries are mutable.
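For instance, a minimal sketch of that rule (the names here are purely illustrative):
d = {3.1: "float key", "spam": "string key", (1, 2): "tuple key"}   # all hashable, all fine
try:
    d[[1, 2]] = "list key"         # a list is mutable, hence unhashable
except TypeError as err:
    print(err)                     # unhashable type: 'list'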
There are several ways to control the precision with which floating-point values are displayed. One of them is the "%" operator, which formats a value to a chosen number of digits, much like printf in C.
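For example (plain Python; this only changes how the number is displayed, not the value that is stored):
x = 3.1
print("%.1f" % x)          # 3.1  -- one digit after the decimal point
print("%.20f" % x)         # 3.10000000000000008882  -- the extra digits of the stored value
print("{:.1f}".format(x))  # the str.format equivalent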
There's no problem using floats as dict keys. Just round(n, 1) them to normalise them to your keyspace.
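A minimal sketch of that idea, using a small stand-in Series (the values are made up):
import pandas as pd

mySeries = pd.Series([16.0, 14.0, 12.0, 10.0, 3.1])
# Round on the way in and on the way out, so both sides of the lookup
# go through the same normalisation.
myDict = {round(v, 1): i for i, v in mySeries.items()}
print(myDict[round(mySeries[4], 1)])   # 4
Note that this only works if the rounded values are still distinct; otherwise later keys silently overwrite earlier ones.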
Floating-point calculations are inexact mainly because they approximate rationals that cannot be represented finitely in base 2, and more generally because they approximate numbers that may not be representable in finitely many digits in any base.
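A quick way to see this in plain Python:
from decimal import Decimal

print(repr(0.1 + 0.2))     # 0.30000000000000004
print(0.1 + 0.2 == 0.3)    # False: neither side is stored exactly
print(Decimal(0.1))        # the binary value actually stored for the literal 0.1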
The dictionary isn't changing the floating-point representation of 3.1; it is simply displaying it at full precision. Your print of mySeries[26] truncates that precision and shows an approximation.
You can prove this:
pd.set_option('display.precision', 20)
Then view mySeries.
0 16.00000000000000000000
1 14.00000000000000000000
2 12.00000000000000000000
3 10.00000000000000000000
4 3.10000000000000008882
dtype: float64
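The same thing can be seen without touching any pandas options, by asking Python for more digits directly (a small sketch; the format specifiers are standard Python):
import numpy as np

v = np.float64(3.1)
print(format(v, ".2g"))     # 3.1  -- short, rounded display
print(format(v, ".20g"))    # 3.1000000000000000888  -- more of the stored value
print(v == 3.1)             # True: it is the same number either way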
EDIT:
"What Every Computer Scientist Should Know About Floating-Point Arithmetic" is always a good read.
EDIT:
Regarding the KeyError, I was not able to replicate the problem.
>>> x = pd.Series([16,14,12,10,3.1])
>>> a = {x[i]: i for i in x.index}
>>> a[x[4]]
4
>>> a.keys()
[16.0, 10.0, 3.1000000000000001, 12.0, 14.0]
>>> hash(x[4])
2093862195
>>> hash(a.keys()[2])
2093862195
The value is already that way in the Series:
>>> x = pd.Series([16,14,12,10,3.1])
>>> x
0 16.0
1 14.0
2 12.0
3 10.0
4 3.1
dtype: float64
>>> x.iloc[4]
3.1000000000000001
This has to do with floating point precision:
>>> np.float64(3.1)
3.1000000000000001
See Floating point precision in Python array for more information about this.
Concerning the KeyError in your edit, I was not able to reproduce it. See below:
>>> d = {x[i]:i for i in x.index}
>>> d
{16.0: 0, 10.0: 3, 12.0: 2, 14.0: 1, 3.1000000000000001: 4}
>>> x[4]
3.1000000000000001
>>> d[x[4]]
4
My suspicion is that the KeyError is coming from the Series: what is mySeries[26] returning?
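For what it's worth, here is a small sketch of how that could produce the error (a stand-in Series where the label 26 simply doesn't exist): the KeyError then comes from the Series lookup itself, before the dictionary is ever consulted.
import pandas as pd

s = pd.Series([16.0, 14.0, 12.0, 10.0, 3.1])    # index is 0..4, so there is no label 26
d = {s[i]: i for i in s.index}
try:
    d[s[26]]                # s[26] itself raises KeyError: 26
except KeyError as err:
    print("KeyError raised by the Series lookup:", err)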