Can't find nan entries using numpy in array of strings

My code is:

for x in X_cat:
    if x == np.nan:
        print('Found')

I know for a fact there are 2 nan entries in the list, but the code runs without printing anything. The same happens if I replace np.nan with 'nan'. My final objective is to replace the nan entries with the most common string.

asked Sep 05 '17 13:09

Peter Lynch

4 Answers

That's because comparing anything with NaN for equality, including NaN itself, evaluates to False. So even when x is np.nan, the print will not run. (In fact, testing x != x used to be an accepted way of checking whether something was NaN, as no other IEEE 754 floating-point value is unequal to itself.)

Use np.isnan(x) to check if x is NaN.
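
A small sketch of the behaviour described above (the variable names are only illustrative). One caveat worth knowing for the asker's case: np.isnan accepts numeric input only, so handing it a Python string raises TypeError.

```python
import numpy as np

x = np.nan

# Equality with NaN is always False, even against NaN itself.
eq_result = (x == np.nan)

# NaN is the only float value that is not equal to itself.
self_ne = (x != x)

# The explicit check.
isnan_result = bool(np.isnan(x))

# np.isnan only accepts numeric input; a Python string raises TypeError.
try:
    np.isnan('hello')
    raised = False
except TypeError:
    raised = True

print(eq_result, self_ne, isnan_result, raised)
```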

answered Oct 08 '22 18:10

Bathsheba

In an array of strings, you can only perform string comparisons, so you need NaN in string form:

nan_str = np.array([np.nan]).astype(str)[0]

Given an array initialized as you describe:

x = np.array(['hello', np.nan, 'world', np.nan], dtype=object)

you can then replace the NaN entries with the most common string, which I assume here to be 'mostcommonstring':

x[np.where(x.astype(str) == nan_str)] = 'mostcommonstring'

answered Oct 08 '22 16:10

Thibaut Loiseleur

You need to check x for NaN with np.isnan:

for x in X_cat:
    if np.isnan(x):
        print('Found')

np.nan == np.nan returns False, so direct comparison is meaningless here. See the isnan entry in the NumPy docs for more.
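
If the array mixes strings with float NaN (an object-dtype array, which is an assumption about what X_cat holds), calling np.isnan on a string element raises TypeError. A possible guarded variant of the loop, with a made-up stand-in for X_cat:

```python
import numpy as np

# Hypothetical stand-in for the asker's X_cat: strings mixed with float NaN.
X_cat = np.array(['a', np.nan, 'b', np.nan], dtype=object)

nan_positions = []
for i, x in enumerate(X_cat):
    # Only floats can be NaN; the isinstance guard keeps np.isnan away from strings.
    if isinstance(x, float) and np.isnan(x):
        nan_positions.append(i)

print(nan_positions)
```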

answered Oct 08 '22 16:10

Oleh Rybalchenko

Not enough reputation to comment on Thibaut's answer, but to simplify it: The nan-string can be np.str_(np.nan) or even str(np.nan).

x = np.array(['hello', np.nan, 'world', np.nan], dtype=object)

x[np.where(x.astype(str)==str(np.nan))] = 'mostcommonstring'
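
To cover the asker's final objective, here is one way the most common string could be computed rather than hard-coded. This is only a sketch: the sample array is invented, collections.Counter is my choice rather than anything from the answers above, and a genuine string 'nan' in the data would also be treated as missing.

```python
from collections import Counter

import numpy as np

# Invented sample data: strings mixed with float NaN in an object array.
x = np.array(['hello', np.nan, 'world', 'hello', np.nan], dtype=object)

# Converting to str turns float NaN into the string 'nan', which can be compared.
nan_mask = x.astype(str) == str(np.nan)

# Most frequent value among the entries that are not NaN.
most_common = Counter(x[~nan_mask]).most_common(1)[0][0]

# Overwrite the NaN entries in place.
x[nan_mask] = most_common
print(x)
```

A boolean mask can index the array directly, so the np.where call is optional here.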

answered Oct 08 '22 18:10

thomaskolasa

Tags: python, arrays, numpy