How to get rid of "RuntimeWarning: invalid value encountered in greater"

This question is very similar to many other questions about the warning RuntimeWarning: invalid value encountered in greater/less/etc.

However, I couldn't find a solution for my particular problem, and I think there should be one.

So, I have a numpy.ndarray similar to this one:

array([[ nan,   1.,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       ..., 
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan]])

I want to calculate array > 0.5. This gives exactly the result I want, but it also raises a warning because of the comparison with nan:

__main__:1: RuntimeWarning: invalid value encountered in greater
Out[68]: 
array([[False,  True, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       ..., 
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False]], dtype=bool)

I basically want to calculate array > 0.5, but without the warning showing up.

My restrictions:

  • I do NOT want to just suppress the warning using with np.errstate(invalid='ignore'):
  • I need to maintain the original array, thus I cannot change it.

I have come up with a simple solution:

  • Change the nan in the original matrix (array[np.isnan(array)] = -np.inf), recovering it back after I do my comparison (array[array == -np.inf] = np.nan)

But all these calculations seem like a waste of time when (I think) there should be a direct way to do this in one step. I have been exploring the numpy.ma module and the numpy.where function, but I couldn't find the "direct" solution I want.

Any thoughts on this?

tjiagoM asked Nov 16 '17 22:11



2 Answers

There is a better way - you don't want to suppress the warning forever, because it could help you find other mistakes later on.

This follows the suggestions in a related question, "RuntimeWarning: invalid value encountered in divide".

The Right Way:

If the result is the one you want, you can just write:

with np.errstate(invalid='ignore'):
    result = (array > 0.5)

# ... use result here; warnings outside the with block are still reported.

A different Wrong Way:

Otherwise, you could meet your restrictions by copying the array:

to_compare = array.copy()
to_compare[np.isnan(to_compare)] = 0.5  # you don't need -np.inf, anything <= 0.5 is OK
result = (to_compare > 0.5)

And you don't need to "recover" the NaNs in your array.
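
To illustrate why the np.errstate block does not silence warnings "forever", here is a minimal sketch. The array values are made up for the example, and np.geterr() is used only to inspect the current floating-point error handling before, inside, and after the block:

```python
import numpy as np

# Made-up sample data; any float array containing NaN would do.
array = np.array([[np.nan, 1.0, np.nan],
                  [0.2, np.nan, 0.7]])

before = np.geterr()["invalid"]      # handling mode before the block

with np.errstate(invalid="ignore"):
    inside = np.geterr()["invalid"]  # "ignore" only while inside the block
    result = array > 0.5             # NaN compares as False

after = np.geterr()["invalid"]       # previous mode is restored on exit

print(before, inside, after)
print(result)
```

The setting is restored as soon as the with block exits, so invalid-value warnings raised elsewhere in the program are still reported.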

Tomasz Gandor answered Oct 21 '22 12:10


You would get that warning whenever an array containing at least one NaN is compared. The solution is to use masking so that only the non-NaN elements are compared. To cover all types of comparisons, a generic version can take one of NumPy's comparison ufuncs as an argument, as shown below -

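import numpy as np

def compare_nan_array(func, a, thresh):
    # Start from the mask of non-NaN positions.
    out = ~np.isnan(a)
    # Compare only the non-NaN values and write the results back into the mask.
    out[out] = func(a[out], thresh)
    return out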

The idea:

  • Get the mask of non-NaNs.

  • Use that to get the non-NaN values from the input array. Then perform the required comparison (greater than, greater than or equal to, etc.) to get another mask, which holds the comparison result for the non-NaN positions.

  • Use this to refine the mask of non-NaNs; the refined mask is the final output.

Sample run -

In [41]: np.random.seed(0)

In [42]: a = np.random.randint(0,9,(4,5)).astype(float)

In [43]: a.ravel()[np.random.choice(a.size, 16, replace=False)] = np.nan

In [44]: a
Out[44]: 
array([[ nan,  nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,   4.,   7.],
       [ nan,  nan,  nan,   1.,  nan],
       [ nan,   7.,  nan,  nan,  nan]])

In [45]: a > 5  # Shows warning with the usual comparison
__main__:1: RuntimeWarning: invalid value encountered in greater
Out[45]: 
array([[False, False, False, False, False],
       [False, False, False, False,  True],
       [False, False, False, False, False],
       [False,  True, False, False, False]], dtype=bool)

# With suggested masking based method
In [46]: compare_nan_array(np.greater, a, 5)
Out[46]: 
array([[False, False, False, False, False],
       [False, False, False, False,  True],
       [False, False, False, False, False],
       [False,  True, False, False, False]], dtype=bool)

Let's test the generic behavior by checking for less than 5 -

In [47]: a < 5
__main__:1: RuntimeWarning: invalid value encountered in less
Out[47]: 
array([[False, False, False, False, False],
       [False, False, False,  True, False],
       [False, False, False,  True, False],
       [False, False, False, False, False]], dtype=bool)

In [48]: compare_nan_array(np.less, a, 5)
Out[48]: 
array([[False, False, False, False, False],
       [False, False, False,  True, False],
       [False, False, False,  True, False],
       [False, False, False, False, False]], dtype=bool)
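
As a possible variation on the same masking idea, NumPy ufuncs accept where= and out= arguments, so the comparison can be restricted to the non-NaN positions without boolean-index assignment. This is only a sketch: the name compare_nan_where and the sample values are hypothetical, not part of the answer above.

```python
import numpy as np

def compare_nan_where(func, a, thresh):
    """Apply a comparison ufunc only where `a` is not NaN (hypothetical helper)."""
    valid = ~np.isnan(a)                 # mask of non-NaN positions
    out = np.zeros(a.shape, dtype=bool)  # NaN positions stay False
    func(a, thresh, out=out, where=valid)
    return out

# Made-up sample data for illustration.
a = np.array([[np.nan, np.nan, 4.0, 7.0],
              [np.nan, 1.0, np.nan, 7.0]])

greater_5 = compare_nan_where(np.greater, a, 5)
less_5 = compare_nan_where(np.less, a, 5)
print(greater_5)
print(less_5)
```

The output array is pre-filled with False because positions excluded by where= are left untouched by the ufunc.
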
Divakar answered Oct 21 '22 13:10