I am attempting to compare two DataFrames with pandas testing assert_frame_equal. These frames contain floats that I want to compare to some user-defined precision.
The check_less_precise argument from assert_frame_equal seems to suggest that I can specify the number of digits after the decimal point to compare. To quote the API Reference page:
check_less_precise: Specify comparison precision. Only used when check_exact is False. 5 digits (False) or 3 digits (True) after decimal points are compared. If int, then specify the digits to compare.
API Reference
However, this doesn't seem to work when the floats are less than 1.
This raises an AssertionError:
import pandas as pd
expected = pd.DataFrame([{"col": 0.1}])
output = pd.DataFrame([{"col": 0.12}])
pd.testing.assert_frame_equal(expected, output, check_less_precise=1)
while this does not:
expected = pd.DataFrame([{"col": 1.1}])
output = pd.DataFrame([{"col": 1.12}])
pd.testing.assert_frame_equal(expected, output, check_less_precise=1)
Can someone help explain this behavior? Is this a bug?
check_less_precise works more like a relative tolerance. See details below.
I dug through the source code and found out what is happening. Eventually the function decimal_almost_equal gets called, which looks like this in plain Python (the real implementation is in Cython).
def decimal_almost_equal(desired, actual, decimal):
    return abs(desired - actual) < (0.5 * 10.0 ** -decimal)
See the source code here. Here is the actual call to the function:
decimal_almost_equal(1, fb / fa, decimal)
Where in this example
fa = .1
fb = .12
decimal = 1
So the function call becomes
decimal_almost_equal(1, 1.2, 1)
Which decimal_almost_equal evaluates as
abs(1 - 1.2) < .5 * 10 ** -1
Or
.2 < .05
Which is False.
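So it seems the comparison is based on the relative (percentage) difference between the two values, not the absolute difference. That also explains why the second example passes: 1.12 / 1.1 is about 1.018, and abs(1 - 1.018) is below the .05 threshold.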
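To make the arithmetic above easy to check, here is a minimal pure-Python sketch that applies the same ratio-based test to both examples from the question. The name relative_check is a hypothetical helper introduced only for illustration; it simply mirrors the call decimal_almost_equal(1, fb / fa, decimal) shown earlier and is not part of pandas.

```python
def decimal_almost_equal(desired, actual, decimal):
    # Plain-Python rendering of the check quoted in the answer above.
    return abs(desired - actual) < (0.5 * 10.0 ** -decimal)


def relative_check(fa, fb, decimal):
    # Hypothetical helper: compares the ratio fb / fa against 1,
    # mirroring the call decimal_almost_equal(1, fb / fa, decimal).
    return decimal_almost_equal(1, fb / fa, decimal)


# First example from the question: the ratio is about 1.2, so the check fails.
print(relative_check(0.1, 0.12, 1))  # False

# Second example: the ratio is about 1.018, so the check passes.
print(relative_check(1.1, 1.12, 1))  # True
```

Both pairs differ by the same absolute amount (0.02), yet only the second passes, which is what you would expect from a relative comparison.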
If you want an absolute comparison, check out np.allclose.
import numpy as np
np.allclose(expected, output, atol=.1)
True
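As a side note, if you are on pandas 1.1 or newer, check_less_precise is deprecated and assert_frame_equal accepts rtol and atol arguments instead, so an absolute comparison can be expressed directly. The following is only a sketch under that version assumption; frames_close is a hypothetical wrapper added for illustration, not a pandas function.

```python
import pandas as pd

expected = pd.DataFrame([{"col": 0.1}])
output = pd.DataFrame([{"col": 0.12}])


def frames_close(left, right, atol):
    # Hypothetical wrapper: returns True when the frames match within an
    # absolute tolerance. rtol=0.0 switches off the relative part.
    # Assumes pandas >= 1.1, where rtol and atol are available.
    try:
        pd.testing.assert_frame_equal(
            left, right, check_exact=False, rtol=0.0, atol=atol
        )
    except AssertionError:
        return False
    return True


# The values differ by 0.02, so a tolerance of 0.1 accepts them
# and a tolerance of 0.01 rejects them.
print(frames_close(expected, output, atol=0.1))
print(frames_close(expected, output, atol=0.01))
```

Setting rtol to zero keeps the check purely absolute, which is the behavior the question was originally expecting from check_less_precise.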