I can compare two Pandas series for exact equality using pandas.Series.equals. Is there a corresponding function or parameter that will check whether the elements are equal to within some ε of precision?
For floating-point numbers, the equality operator (==) can give unexpected results because of rounding errors in the internal binary representation: two values that are mathematically equal may differ in their last bits, so an exact comparison reports them as unequal.
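A minimal sketch of that rounding problem, using plain Python floats. The tolerance of 1e-9 is an arbitrary value chosen for illustration, not a recommended default.

```python
# 0.1 and 0.2 have no exact binary representation, so their sum
# is not exactly the float closest to 0.3.
total = 0.1 + 0.2

exact_equal = (total == 0.3)          # exact comparison fails
difference = abs(total - 0.3)         # tiny, but not zero

# Comparing against a small tolerance (illustrative value) succeeds.
within_tolerance = difference < 1e-9

print(exact_equal, within_tolerance)
```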
A pandas Series has a method called gt() that applies the greater-than condition element-wise between two Series objects. The result of gt() is a boolean Series holding the outcome of each element-wise comparison.
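As a small illustration of gt(), here is a sketch with two made-up series (the names left and right and their values are assumptions for the example only):

```python
import pandas as pd

# Two example series with the same default integer index.
left = pd.Series([1, 5, 3])
right = pd.Series([2, 4, 3])

# Element-wise "left > right"; the result is a boolean Series.
result = left.gt(right)
print(result.tolist())
```

Note that gt() is a strict comparison, so the equal pair (3, 3) yields False.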
How To Compare Floats in Python
If abs(a - b) is smaller than some percentage of the larger of a or b, then a is considered sufficiently close to b to be "equal" to b. This percentage is called the relative tolerance. You can specify the relative tolerance with the rel_tol keyword argument of math.
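The snippet above is cut off, but the standard library function that takes a rel_tol keyword is math.isclose, so the sketch below assumes that is what was meant. The numbers are made up for illustration.

```python
import math

a, b = 1000.0, 1000.1

# Default rel_tol is 1e-09, far too strict for a difference of 0.1.
default_close = math.isclose(a, b)

# With rel_tol=1e-3 the allowed difference is about 0.1% of the
# larger magnitude (roughly 1.0 here), so 0.1 is close enough.
loose_close = math.isclose(a, b, rel_tol=1e-3)

# A relative tolerance alone never matches against zero;
# abs_tol supplies an absolute floor for that case.
near_zero_rel = math.isclose(1e-10, 0.0)
near_zero_abs = math.isclose(1e-10, 0.0, abs_tol=1e-9)

print(default_close, loose_close, near_zero_rel, near_zero_abs)
```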
Relative Comparison of Floating-point Values
If a and b differ in sign, then returns the largest representable value for T. If a and b are both infinities (of the same sign), then returns zero. If just one of a and b is an infinity, then returns the largest representable value for T.
You can use numpy.allclose:
numpy.allclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)
Returns True if two arrays are element-wise equal within a tolerance. The tolerance values are positive, typically very small numbers. The relative difference (rtol * abs(b)) and the absolute difference atol are added together to compare against the absolute difference between a and b.
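To make the tolerance rule concrete, here is a sketch that evaluates abs(a - b) <= atol + rtol * abs(b) by hand and compares it with numpy.isclose, the element-wise counterpart of numpy.allclose. The arrays are made-up values; rtol and atol are the documented defaults.

```python
import numpy as np

a = np.array([1.0, 100.0, 1e-9])
b = np.array([1.00001, 100.1, 0.0])
rtol, atol = 1e-05, 1e-08  # numpy's default tolerances

# The check spelled out: relative part plus absolute part.
manual = np.abs(a - b) <= atol + rtol * np.abs(b)

# numpy's own element-wise result, and the all-elements summary.
elementwise = np.isclose(a, b, rtol=rtol, atol=atol)
everything_close = np.allclose(a, b, rtol=rtol, atol=atol)

print(manual, elementwise, everything_close)
```

Because the relative term is scaled by abs(b) only, the check is not symmetric in a and b. The third pair passes purely because of atol, since rtol * abs(0.0) is zero.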
numpy works well with pandas.Series objects, so if you have two of them, s1 and s2, you can simply do:
np.allclose(s1, s2, atol=...)
where atol is your tolerance value.
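A runnable version of that call might look like the following sketch. The series values and the tolerance of 1e-3 are assumptions picked for the example.

```python
import numpy as np
import pandas as pd

s1 = pd.Series([1.0, 2.0, 3.0])
s2 = pd.Series([1.0005, 2.0, 2.9995])

# Exact comparison fails because of the small differences.
exactly_equal = s1.equals(s2)

# The default tolerances are too tight for differences of 5e-4.
close_default = np.allclose(s1, s2)

# A looser absolute tolerance accepts them.
close_loose = np.allclose(s1, s2, atol=1e-3)

print(exactly_equal, close_default, close_loose)
```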
Numpy works well with pandas Series. However, one has to be careful with the order of the index (or of both columns and index for a pandas DataFrame), because np.allclose compares values by position and ignores the labels.
For example,
series_1 = pd.Series(data=[0,1], index=['a','b'])
series_2 = pd.Series(data=[1,0], index=['b','a'])
np.allclose(series_1, series_2)
will return False, even though both series hold the same value for each label.
A workaround is to reorder one series by the index of the other:
np.allclose(series_1, series_2.loc[series_1.index])
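If you need this often, the workaround can be wrapped in a small function. The helper below, series_allclose, is a hypothetical name and not part of pandas or numpy. It is only a sketch and assumes both series have unique index labels.

```python
import numpy as np
import pandas as pd

def series_allclose(left, right, rtol=1e-05, atol=1e-08):
    """Hypothetical helper: label-aware np.allclose for two Series.

    Assumes unique index labels in both series.
    """
    # Different label sets can never be "equal"; this also keeps
    # reindex from silently introducing NaN for missing labels.
    if not left.index.sort_values().equals(right.index.sort_values()):
        return False
    # Put `right` in the same label order as `left`, then compare.
    aligned = right.reindex(left.index)
    return bool(np.allclose(left, aligned, rtol=rtol, atol=atol))

series_1 = pd.Series(data=[0, 1], index=['a', 'b'])
series_2 = pd.Series(data=[1, 0], index=['b', 'a'])
series_3 = pd.Series(data=[0, 1], index=['a', 'c'])

positional = bool(np.allclose(series_1, series_2))  # ignores labels
aligned_same = series_allclose(series_1, series_2)  # respects labels
aligned_diff = series_allclose(series_1, series_3)  # label mismatch
print(positional, aligned_same, aligned_diff)
```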