I'm looking to compare two dataframes that should be identical. However, due to floating-point precision, I am being told the values don't match. I have created an example below to reproduce it. How can I get the final comparison dataframe to return True for both cells?
import pandas as pd

a = pd.DataFrame({'A': [100, 97.35000000001]})
b = pd.DataFrame({'A': [100, 97.34999999999]})
print(a)
        A
0  100.00
1   97.35
print(b)
        A
0  100.00
1   97.35
print(a == b)
       A
0   True
1  False
The common wisdom that floating-point numbers cannot be compared for equality is inaccurate. In this respect floating-point numbers are no different from integers: if you evaluate "a == b", you get true if they are identical numbers and false otherwise (with the understanding that two NaNs are not identical numbers).
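A minimal sketch of that point, using the two values from the question. The variable names are illustrative only. Exact equality is well defined for floats; the two values in the question are simply different numbers.

```python
import math

x = 97.35000000001
y = 97.34999999999

# == is an exact comparison: these two doubles differ by about 2e-11,
# so they are not identical numbers.
exact_equal = (x == y)

# Values that are exactly representable compare equal as expected.
same_value = (0.5 == 0.25 + 0.25)

# NaN is the exception: it never compares equal, not even to itself.
nan_equal = (math.nan == math.nan)

print(exact_equal, same_value, nan_equal)
```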
How to compare floats in Python: if abs(a - b) is smaller than some percentage of the larger of abs(a) and abs(b), then a is considered sufficiently close to b to be "equal" to b. This percentage is called the relative tolerance. You can specify the relative tolerance with the rel_tol keyword argument of math.
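As a sketch of the relative-tolerance test on plain Python floats, the standard library's math.isclose accepts rel_tol (and abs_tol) keyword arguments. The values are taken from the question; the names close_default, close_strict and manual are illustrative only.

```python
import math

a = 97.35000000001
b = 97.34999999999

# math.isclose checks:
#   abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
# with defaults rel_tol=1e-09 and abs_tol=0.0.
close_default = math.isclose(a, b)

# A much tighter relative tolerance rejects the same pair.
close_strict = math.isclose(a, b, rel_tol=1e-15)

# The same relative test written out by hand.
manual = abs(a - b) <= 1e-09 * max(abs(a), abs(b))

print(close_default, close_strict, manual)
```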
OK, you can use np.isclose for this:
In [250]:
np.isclose(a, b)
Out[250]:
array([[ True],
       [ True]], dtype=bool)
np.isclose takes a relative tolerance and an absolute tolerance. These have default values of rtol=1e-05 and atol=1e-08 respectively.
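Since the question asks for a comparison dataframe rather than a bare array, one possible way to get it is to wrap the np.isclose result back into a DataFrame. This is only a sketch: the names close and tight are illustrative, and passing .to_numpy() arrays is an assumption made here to keep the inputs explicit.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({'A': [100, 97.35000000001]})
b = pd.DataFrame({'A': [100, 97.34999999999]})

# np.isclose tests, element by element:
#   abs(a - b) <= atol + rtol * abs(b)
# Wrap the boolean array in a DataFrame that reuses a's labels.
close = pd.DataFrame(
    np.isclose(a.to_numpy(), b.to_numpy()),
    index=a.index,
    columns=a.columns,
)
print(close)

# Tolerances can be tightened; with rtol=0 and atol=1e-12 the
# second row (difference of about 2e-11) no longer counts as close.
tight = np.isclose(a.to_numpy(), b.to_numpy(), rtol=0, atol=1e-12)
print(tight)
```

If only a single yes/no answer is needed, close.all().all() reduces the frame to one boolean.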