I have a numpy array arr. It's a numpy.ndarray, size is (5553110,), dtype=float32.
When I do:
(arr > np.pi )[3154950]
False
(arr[3154950] > np.pi )
True
Why is the first comparison getting it wrong? And how can I fix it?
The values:
arr[3154950]= 3.1415927
np.pi= 3.141592653589793
Is the problem with precision?
To check whether two NumPy arrays A and B are equal, compare them with == to form a Boolean comparison array, then check that all of its elements are True, for example with (A == B).all().
To compare each element of a NumPy array arr against a scalar x using any of the greater than (>), greater than or equal (>=), less than (<), less than or equal (<=), or equal (==) operators, rely on broadcasting: use the array as one operand and the scalar as the other, and the result is a Boolean array with the same shape as arr.
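As a minimal sketch of the scalar comparison described above (the array values and the name x are made up for illustration, and are chosen to be exactly representable in float32 so that rounding plays no role):

```python
import numpy as np

# Hypothetical sample data; 1.0, 2.5 and 4.0 are exact in float32.
arr = np.array([1.0, 2.5, 4.0], dtype=np.float32)
x = 2.5

# The scalar is broadcast against every element of the array.
greater = (arr > x).tolist()
greater_equal = (arr >= x).tolist()
equal = (arr == x).tolist()

print(greater)        # [False, False, True]
print(greater_equal)  # [False, True, True]
print(equal)          # [False, True, False]
```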
The numpy.array_equiv(a1, a2) function can also be used to check whether two arrays are equal. It returns True if the arrays are shape consistent (they have the same shape, or one can be broadcast to the shape of the other) and all elements are equal; otherwise it returns False. Use numpy.array_equal() when the shapes must match exactly.
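The following sketch compares three ways of testing array equality. The arrays a, b and c are invented for illustration. Note that numpy.array_equiv() accepts arrays whose shapes are merely broadcast-compatible, whereas numpy.array_equal() requires identical shapes.

```python
import numpy as np

# Hypothetical sample arrays.
a = np.array([1, 2, 3])
b = np.array([1, 2, 3])
c = np.array([[1, 2, 3], [1, 2, 3]])  # shape (2, 3)

# Elementwise comparison reduced with .all().
same_elementwise = bool((a == b).all())

# array_equal: shapes must match exactly.
equal_ab = np.array_equal(a, b)
equal_ac = np.array_equal(a, c)

# array_equiv: a can be broadcast to c's shape, and all elements match.
equiv_ac = np.array_equiv(a, c)

print(same_elementwise, equal_ab, equal_ac, equiv_ac)  # True True False True
```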
The problem is due to the precision of np.float32 vs np.float64. Use np.float64 and you will not see a problem:
import numpy as np
arr = np.array([3.1415927], dtype=np.float64)
print((arr > np.pi)[0]) # True
print(arr[0] > np.pi) # True
As @WarrenWeckesser comments:
It involves how numpy decides to cast the arguments of its operations. Apparently, with arr > scalar, the scalar is converted to the same type as the array arr, which in this case is np.float32. On the other hand, with something like arr > arr2, with both arguments nonscalar arrays, they will use a common data type. That's why (arr > np.array([np.pi]))[3154950] returns True.
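A small sketch of the casting behaviour described in that comment, using a three-element stand-in for the original array (the names pi32, mask_scalar, mask_array and mask_upcast are illustrative only). The nearest float32 to pi is slightly larger than the float64 value of np.pi, so the result depends on which precision the comparison is carried out in. The comparison of a single float32 element against np.pi is left out on purpose, since its result may depend on the NumPy version and its scalar promotion rules.

```python
import numpy as np

# Nearest float32 to pi, roughly 3.14159274, which is above np.pi.
pi32 = np.float32(np.pi)
rounded_up = float(pi32) > np.pi

# Stand-in for the original array: every element is the float32 pi.
arr = np.full(3, np.pi, dtype=np.float32)

# Array vs Python float: the scalar is treated as float32, so both
# sides are the same float32 value and ">" is False.
mask_scalar = arr > np.pi

# Array vs float64 array: both sides are promoted to float64, so the
# rounded-up float32 value compares greater than np.pi.
mask_array = arr > np.array([np.pi])

# Explicit upcast of the float32 data gives the same float64 comparison.
mask_upcast = arr.astype(np.float64) > np.pi

print(rounded_up)            # True
print(mask_scalar.tolist())  # [False, False, False]
print(mask_array.tolist())   # [True, True, True]
print(mask_upcast.tolist())  # [True, True, True]
```

If the comparison should be done in double precision, upcasting with arr.astype(np.float64) or comparing against a float64 array makes that explicit instead of relying on how the scalar is cast.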
Related github issue