I have a NumPy array arr. It's a numpy.ndarray with shape (5553110,) and dtype=float32.
When I do:
(arr > np.pi )[3154950]
False
(arr[3154950] > np.pi )
True
Why is the first comparison getting it wrong? And how can I fix it?
The values:
arr[3154950]= 3.1415927
np.pi= 3.141592653589793
Is the problem with precision?
To check if two NumPy arrays A and B are equal: Use a comparison operator (==) to form a comparison array. Check if all the elements in the comparison array are True.
To compare each element of a NumPy array arr against a scalar x with any of the greater than (>), greater than or equal (>=), less than (<), less than or equal (<=), or equal (==) operators, rely on broadcasting: use the array as one operand and the scalar as the other.
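As a minimal sketch of the two comparisons described above (the arrays a and b are made-up illustration values, not data from the question):

```python
import numpy as np

# Hypothetical example arrays, chosen only for illustration.
a = np.array([1.0, 2.0, 3.0])
b = np.array([1.0, 2.0, 3.5])

# Element-wise equality gives a boolean array; .all() reduces it to one answer.
elementwise = a == b
all_equal = bool(elementwise.all())

# Comparing against a scalar broadcasts the scalar across every element.
mask = a >= 2.0

print(elementwise, all_equal, mask)
```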
Compare two arrays in Python using numpy.array_equiv(): numpy.array_equiv(a1, a2) takes arrays a1 and a2 and returns True if their shapes are consistent (either identical, or one can be broadcast to the other) and all elements are equal; otherwise it returns False. If the shapes must match exactly, use numpy.array_equal() instead.
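A short sketch of how numpy.array_equiv() treats shapes, next to numpy.array_equal() for contrast (x and y are illustration values only):

```python
import numpy as np

# Hypothetical inputs: x has shape (2,), y has shape (2, 2).
x = np.array([1, 2])
y = np.array([[1, 2], [1, 2]])

# array_equiv allows broadcasting: x is compared against each row of y.
equiv = bool(np.array_equiv(x, y))

# array_equal requires the shapes to match exactly.
equal = bool(np.array_equal(x, y))

print(equiv, equal)
```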
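The problem is due to the limited precision of np.float32 compared with np.float64. The float32 value nearest to pi is about 3.1415927410, which is slightly larger than the float64 value np.pi, so the result depends on which type the comparison is carried out in.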
Use np.float64 and you will not see a problem:
import numpy as np
arr = np.array([3.1415927], dtype=np.float64)
print((arr > np.pi)[0])  # True
print(arr[0] > np.pi)    # True
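To see where the discrepancy comes from, the following sketch rounds pi to float32 and then widens it back to a Python float. The variable names are chosen here for illustration.

```python
import numpy as np

# Round pi to the nearest float32 value.
pi32 = np.float32(np.pi)

# Widening float32 to a Python float (a double) is exact, so this
# exposes the value that float32 actually stores.
as_double = float(pi32)

# The stored float32 value lies slightly above the double-precision pi.
rounded_up = as_double > np.pi

# The value printed in the question, 3.1415927, rounds to the same float32.
same_float32 = bool(pi32 == np.float32(3.1415927))

print(as_double, rounded_up, same_float32)
```

Compared as float32, the element and pi are the same number, so ">" is False. Compared as float64, the element is slightly larger than pi, so ">" is True.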
As @WarrenWeckesser comments:
It involves how numpy decides to cast the arguments of its operations. Apparently, with arr > scalar, the scalar is converted to the same type as the array arr, which in this case is np.float32. On the other hand, with something like arr > arr2, with both arguments nonscalar arrays, they will use a common data type. That's why (arr > np.array([np.pi]))[3154950] returns True.
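If you want the outcome to be independent of how NumPy casts the scalar, one option is to make the comparison dtype explicit. The sketch below uses a one-element float32 array as a stand-in for the original data. How a bare arr > np.pi is promoted may differ between NumPy versions, so it is deliberately left out.

```python
import numpy as np

# Stand-in for the original array: one float32 element with the same value.
arr = np.array([3.1415927], dtype=np.float32)

# Widen the array first, so the comparison happens in float64.
wide = arr.astype(np.float64) > np.pi

# Compare against a non-scalar float64 array, as in the comment above:
# the common dtype of float32 and float64 is float64.
via_array = arr > np.array([np.pi])

# Narrow pi explicitly, so the comparison happens in float32.
narrow = arr > np.float32(np.pi)

print(wide[0], via_array[0], narrow[0])
```

The first two comparisons treat the element as slightly larger than pi. The third treats the two values as equal, which matches the False from the question.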
Related github issue