This question is very similar to many other questions about the warning RuntimeWarning: invalid value encountered in greater/less/etc.
However, I couldn't find a solution for my particular problem, and I think there should be one.
So, I have a numpy.ndarray similar to this one:
array([[ nan, 1., nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
...,
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan],
[ nan, nan, nan, ..., nan, nan, nan]])
I want to calculate array > 0.5, which gives exactly the result I want, but with a warning caused by comparing against nan:
__main__:1: RuntimeWarning: invalid value encountered in greater
Out[68]:
array([[False, True, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
...,
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False]], dtype=bool)
I basically want to calculate array > 0.5, but without the warning showing up.
My restrictions:
with np.errstate(invalid='ignore'):
I have come up with a simple solution: replacing nan in the original matrix (array[np.isnan(array)] = -np.inf), then restoring it after I do my comparison (array[array == -np.inf] = np.nan).
But I think all these calculations are a waste of time when (I think) there should be a direct way to do this at once. I have been exploring the numpy.ma module and the numpy.where function, but I couldn't find the "direct" solution I want.
Any thoughts on this?
There is a better way - you don't want to suppress the warning forever, because it could help you find other mistakes later on.
Following the suggestions found in this question (RuntimeWarning: invalid value encountered in divide), if the result is the one you want, you can just write:
with np.errstate(invalid='ignore'):
    result = (array > 0.5)
# ... use result; the warning is suppressed only inside the with block.
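To see that the context manager is scoped, here is a minimal sketch. The sample array is a hypothetical stand-in for the one in the question; np.geterr is used only to inspect the current error-handling settings before and after the block.

```python
import numpy as np

# Hypothetical sample data standing in for the question's array.
array = np.array([[np.nan, 1.0, np.nan],
                  [np.nan, np.nan, 0.2]])

settings_before = np.geterr()

with np.errstate(invalid='ignore'):
    # "invalid" floating-point warnings are silenced only inside this block.
    result = array > 0.5

settings_after = np.geterr()

print(result)
# The error-handling settings are restored once the block exits.
print(settings_before == settings_after)
```

Any later computation that hits an invalid value is reported under whatever settings were active before the block.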
Otherwise, you could meet your restrictions by copying the array:
to_compare = array.copy()
to_compare[np.isnan(to_compare)] = 0.5 # you don't need -np.inf, anything <= 0.5 is OK
result = (to_compare > 0.5)
And you don't need to "recover" the NaNs in your array.
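The same idea can probably be written without an explicit copy-and-assign step by using np.where, which the question mentions. This is only a sketch under that assumption; greater_ignoring_nan is a hypothetical helper name, not something from NumPy.

```python
import numpy as np

def greater_ignoring_nan(values, threshold):
    """Return values > threshold with NaN treated as False; values is not modified."""
    # np.where builds a new array, so the input keeps its NaNs.
    # NaNs are replaced by the threshold itself, which is never > threshold.
    filled = np.where(np.isnan(values), threshold, values)
    return filled > threshold

array = np.array([np.nan, 1.0, 0.2, np.nan])
result = greater_ignoring_nan(array, 0.5)

print(result)
print(np.isnan(array).sum())  # the original array still holds its NaNs
```

Like the copy-based version, this allocates one temporary array the size of the input.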
You would get that warning whenever an array containing at least one NaN is compared. The solution would be to use masking to compare only the non-NaN elements, and we can make it generic to cover all types of comparisons with the help of the comparison-based NumPy ufuncs, as shown below -
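def compare_nan_array(func, a, thresh):
    # Mask of the non-NaN positions
    out = ~np.isnan(a)
    # Compare only the non-NaN values and write the result back into the mask
    out[out] = func(a[out], thresh)
    return out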
The idea is:
Get the mask of non-NaNs.
Use that to get the non-NaN values from the input array. Then perform the required comparison (greater than, greater than or equal to, etc.) on those values to get a second mask, which holds the comparison result for the non-NaN places.
Use this to refine the mask of non-NaNs, and this is the final output.
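The three steps above can be traced on a small array. This is an illustrative sketch with hypothetical sample values, keeping each intermediate result in its own variable instead of updating the mask in place.

```python
import numpy as np

a = np.array([np.nan, 4.0, 7.0, np.nan, 1.0])

# Step 1: mask of non-NaN positions.
valid = ~np.isnan(a)      # [False, True, True, False, True]

# Step 2: compare only the non-NaN values [4., 7., 1.].
compared = a[valid] > 5   # [False, True, False]

# Step 3: write the comparison back into the non-NaN positions.
out = valid.copy()
out[valid] = compared

print(out)
```

NaN positions stay False because they were False in the mask from step 1 and are never written to.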
Sample run -
In [41]: np.random.seed(0)
In [42]: a = np.random.randint(0,9,(4,5)).astype(float)
In [43]: a.ravel()[np.random.choice(a.size, 16, replace=0)] = np.nan
In [44]: a
Out[44]:
array([[ nan, nan, nan, nan, nan],
[ nan, nan, nan, 4., 7.],
[ nan, nan, nan, 1., nan],
[ nan, 7., nan, nan, nan]])
In [45]: a > 5 # Shows warning with the usual comparison
__main__:1: RuntimeWarning: invalid value encountered in greater
Out[45]:
array([[False, False, False, False, False],
[False, False, False, False, True],
[False, False, False, False, False],
[False, True, False, False, False]], dtype=bool)
# With suggested masking based method
In [46]: compare_nan_array(np.greater, a, 5)
Out[46]:
array([[False, False, False, False, False],
[False, False, False, False, True],
[False, False, False, False, False],
[False, True, False, False, False]], dtype=bool)
Let's test out the generic behavior by testing for less than 5 -
In [47]: a < 5
__main__:1: RuntimeWarning: invalid value encountered in less
Out[47]:
array([[False, False, False, False, False],
[False, False, False, True, False],
[False, False, False, True, False],
[False, False, False, False, False]], dtype=bool)
In [48]: compare_nan_array(np.less, a, 5)
Out[48]:
array([[False, False, False, False, False],
[False, False, False, True, False],
[False, False, False, True, False],
[False, False, False, False, False]], dtype=bool)
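As a rough check that the masked comparison never touches a NaN, the function can be run under np.errstate(invalid='raise'), which turns any invalid-value floating-point event into a FloatingPointError. The sample array is hypothetical, and whether the plain a > 5 triggers the event at all may depend on the NumPy version.

```python
import numpy as np

def compare_nan_array(func, a, thresh):
    out = ~np.isnan(a)
    out[out] = func(a[out], thresh)
    return out

a = np.array([[np.nan, 4.0, 7.0],
              [1.0, np.nan, 5.0]])

comparisons = [
    ("greater", np.greater),
    ("greater_equal", np.greater_equal),
    ("less", np.less),
    ("less_equal", np.less_equal),
]

# Any invalid-value event inside this block would raise FloatingPointError.
with np.errstate(invalid='raise'):
    results = {name: compare_nan_array(func, a, 5) for name, func in comparisons}

for name, mask in results.items():
    print(name)
    print(mask)
```

The NaN positions come out False for every comparison, matching what the plain operators return.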