How to get rid of "RuntimeWarning: invalid value encountered in greater"

Tags: numpy

This question is very similar to many other questions about the warning RuntimeWarning: invalid value encountered in greater/less/etc.

However, I couldn't find a solution for my particular problem, and I think there should be one.

So, I have a numpy.ndarray similar to this one:

array([[ nan,   1.,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       ..., 
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan],
       [ nan,  nan,  nan, ...,  nan,  nan,  nan]])

I want to calculate array > 0.5, which gives exactly the result I want, but it also raises a warning because some of the values being compared are nan:

__main__:1: RuntimeWarning: invalid value encountered in greater
Out[68]: 
array([[False,  True, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       ..., 
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False]], dtype=bool)

I basically want to calculate array > 0.5, but without the warning showing up.

My restrictions:

I do NOT want to just suppress the warning using a with np.errstate(invalid='ignore'): block.
I need to keep the original array, so I cannot modify it.

I have come up with a simple solution:

Change the nan in the original matrix (array[np.isnan(array)] = -np.inf), recovering it back after I do my comparison (array[array == -np.inf] = np.nan)

But all these calculations seem like a waste of time when (I think) there should be a direct way to do this in one step. I have explored the numpy.ma module and the numpy.where function, but I couldn't find the "direct" solution I am after.

Any thoughts on this?

asked Nov 16 '17 22:11

tjiagoM

2 Answers

There is a better way - you don't want to suppress the warning forever, because it could help you find other mistakes later on.

This follows the suggestion in the question "RuntimeWarning: invalid value encountered in divide".

The Right Way:

If the result is the one you want, you can just write:

with np.errstate(invalid='ignore'):
    result = (array > 0.5)

# ... use result; outside the with block, warnings are reported as usual.
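
To see that the suppression really is limited to the with block, here is a minimal sketch. The small sample array and the variable names before, inside and after are made up for illustration; np.geterr() just reports the current error-handling settings.

```python
import numpy as np

# Hypothetical sample data, standing in for the array from the question.
array = np.array([[np.nan, 1.0], [0.2, np.nan]])

before = np.geterr()          # settings in effect before the block
with np.errstate(invalid="ignore"):
    inside = np.geterr()      # 'invalid' is set to 'ignore' only in here
    result = array > 0.5
after = np.geterr()           # previous settings are restored on exit

print(inside["invalid"])      # ignore
print(after == before)        # True
print(result)
```

So other code that runs after the block still gets its invalid-value warnings.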

A different Right Way:

Otherwise, you could meet your restrictions by copying the array:

to_compare = array.copy()
to_compare[np.isnan(to_compare)] = 0.5  # you don't need -np.inf, anything <= 0.5 is OK
result = (to_compare > 0.5)

And you don't need to "recover" the NaNs in your array.
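
If you prefer a single expression, the copy and the replacement can probably be folded into one np.where call. This is only a sketch of the same idea, with a made-up sample array; np.where returns a new array, so the original is left alone.

```python
import numpy as np

# Hypothetical sample data, standing in for the array from the question.
array = np.array([[np.nan, 1.0, np.nan], [0.3, np.nan, 0.5]])

# Build a new array: NaN positions get the placeholder 0.5,
# every other position keeps its original value.
to_compare = np.where(np.isnan(array), 0.5, array)
result = to_compare > 0.5

print(result)
print(np.isnan(array).sum())  # the original still holds its 3 NaNs
```

The placeholder 0.5 is tied to the threshold here; for a different comparison you would have to pick a value that makes the comparison come out False.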

answered Oct 21 '22 12:10

Tomasz Gandor

You get that warning whenever an array containing at least one NaN is compared. The solution is to use masking so that only the non-NaN elements are compared. To cover all types of comparisons, we can write a generic helper that takes one of NumPy's comparison ufuncs as an argument, as shown below -

import numpy as np

def compare_nan_array(func, a, thresh):
    out = ~np.isnan(a)                # True where a holds a real number
    out[out] = func(a[out], thresh)   # compare only the non-NaN values
    return out

The idea is:

Get the mask of non-NaNs.
Use that to get the non-NaN values from the input array. Then perform the required comparison (greater than, greater than or equal to, etc.) on those values to get another mask, which holds the comparison result for the non-NaN places.
Write that result back into the non-NaN places of the first mask. The NaN places stay False, and this is the final output.
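
The same masking idea can also be sketched with the where and out arguments that NumPy ufuncs accept, which avoids the boolean indexing. The name compare_nan_array_where is made up for this example and is not part of NumPy.

```python
import numpy as np

def compare_nan_array_where(func, a, thresh):
    # Hypothetical variant of compare_nan_array; func must be a NumPy ufunc.
    # Start from all False, because positions excluded by `where`
    # are left untouched in `out`.
    out = np.zeros(a.shape, dtype=bool)
    # Evaluate the comparison only where a is not NaN.
    func(a, thresh, out=out, where=~np.isnan(a))
    return out

a = np.array([[np.nan, 4.0, 7.0], [1.0, np.nan, 7.0]])
result = compare_nan_array_where(np.greater, a, 5)
print(result)
```

Since the NaN elements are never passed to the comparison, no invalid-value warning should be triggered, and the input array is not modified.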

Sample run -

In [41]: np.random.seed(0)

In [42]: a = np.random.randint(0,9,(4,5)).astype(float)

In [43]: a.ravel()[np.random.choice(a.size, 16, replace=0)] = np.nan

In [44]: a
Out[44]: 
array([[ nan,  nan,  nan,  nan,  nan],
       [ nan,  nan,  nan,   4.,   7.],
       [ nan,  nan,  nan,   1.,  nan],
       [ nan,   7.,  nan,  nan,  nan]])

In [45]: a > 5  # Shows warning with the usual comparison
__main__:1: RuntimeWarning: invalid value encountered in greater
Out[45]: 
array([[False, False, False, False, False],
       [False, False, False, False,  True],
       [False, False, False, False, False],
       [False,  True, False, False, False]], dtype=bool)

# With suggested masking based method
In [46]: compare_nan_array(np.greater, a, 5)
Out[46]: 
array([[False, False, False, False, False],
       [False, False, False, False,  True],
       [False, False, False, False, False],
       [False,  True, False, False, False]], dtype=bool)

Let's test out the generic behavior by testing for less than 5 -

In [47]: a < 5
__main__:1: RuntimeWarning: invalid value encountered in less
Out[47]: 
array([[False, False, False, False, False],
       [False, False, False,  True, False],
       [False, False, False,  True, False],
       [False, False, False, False, False]], dtype=bool)

In [48]: compare_nan_array(np.less, a, 5)
Out[48]: 
array([[False, False, False, False, False],
       [False, False, False,  True, False],
       [False, False, False,  True, False],
       [False, False, False, False, False]], dtype=bool)
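
One caveat worth checking: the masked helper always returns False at the NaN positions. That matches what NumPy gives for >, >=, <, <= and ==, but not for !=, because NaN != x evaluates to True. A small sketch to verify this, using a made-up array and a dictionary called agrees introduced just for this check:

```python
import numpy as np

def compare_nan_array(func, a, thresh):
    out = ~np.isnan(a)
    out[out] = func(a[out], thresh)
    return out

a = np.array([[np.nan, 4.0, 7.0], [1.0, np.nan, 5.0]])

# For each comparison ufunc, check whether the masked helper gives the
# same answer as the plain comparison (run with the warning silenced).
agrees = {}
for func in (np.greater, np.greater_equal, np.less,
             np.less_equal, np.equal, np.not_equal):
    with np.errstate(invalid="ignore"):
        plain = func(a, 5)
    masked = compare_nan_array(func, a, 5)
    agrees[func.__name__] = bool(np.array_equal(plain, masked))

print(agrees)
```

Only not_equal comes out as False here, so for != you would need to decide explicitly how NaN positions should be treated.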

answered Oct 21 '22 13:10

Divakar
