I am trying to find all NaNs and empty strings (i.e., "") in a Python list of strings. Please see the following code, which shows the 3 options I tried:
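import numpy as np
import pandas as pd

names = ['Pat', 'Sam', np.nan, 'Tom', '']
for idx, name in enumerate(names):
    if name == '':          # Option 1
    # if pd.isnull(name):   # Option 2 (swap in for the line above)
    # if np.isnan(name):    # Option 3 (swap in for the line above)
        print(idx)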
Option 1: This check, name=="", doesn't catch NaN
Option 2: This check, pd.isnull(name), doesn't catch the empty string
Option 3: This check, np.isnan(name), gives the following error on the strings (e.g., "Pat"):
----> 6 if np.isnan(name):
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
Question: Is there a function or method that checks for both empty strings and NaNs, and does not raise an error when it encounters an ordinary string?
The math.isnan() function checks whether a value is NaN (Not a Number). It returns True if the value is NaN and False otherwise. It accepts only numeric arguments, so passing a string such as "Pat" raises a TypeError.
NaN stands for Not a Number and is one of the common ways to represent a missing value in Python and in pandas DataFrames. It is often necessary to replace missing values with values that make sense, such as zeros for numeric columns and blank or empty strings for string-type columns.
We can check whether a value is NaN by using the property that NaN is not equal to itself (NaN != NaN). Let us define a boolean function isNaN() that returns True if the given argument is NaN and False otherwise. We can also convert a value to float and then check whether the result is NaN.
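The two ideas in the paragraph above can be sketched roughly as follows. This is only an illustration using the standard library: isNaN() follows the name used in the text, while is_nan_via_float() is a hypothetical helper name chosen here, and treating unconvertible values as "not NaN" is an assumption rather than a rule.

```python
import math


def isNaN(value):
    """Return True if value is NaN, relying on the fact that NaN != NaN."""
    # Ordinary strings and numbers are equal to themselves, so this is False for them.
    return value != value


def is_nan_via_float(value):
    """Hypothetical helper: convert to float first, then test with math.isnan.

    Values that cannot be converted (e.g. 'Pat', '' or None) are treated
    as not NaN. Note that the string 'nan' does convert to a float NaN.
    """
    try:
        return math.isnan(float(value))
    except (TypeError, ValueError):
        return False


nan = float("nan")
print(isNaN(nan))               # True
print(isNaN("Pat"))             # False
print(is_nan_via_float(nan))    # True
print(is_nan_via_float("Pat"))  # False
```

Neither function treats the empty string as missing, so a separate check for "" is still needed for the question asked here.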
There is a way to combine options #1 and #2 and get the result you are looking for:
names = ['Pat', 'Sam', np.nan, 'Tom', '']
for idx, name in enumerate(names):
    if not name or pd.isnull(name):
        print(idx)
Just use both:
>>> names=['Pat','Sam', np.nan, 'Tom', '']
>>> for idx,name in enumerate(names):
...     if name == '' or pd.isnull(name):
...         print(idx)
...
2
4
However, realize that:
>>> pd.isnull(None)
True
So if you want to check specifically for NaN and not None, use math.isnan (while guarding against passing non-float values to math.isnan):
>>> import math
>>> for idx,name in enumerate(names):
...     if name == '' or (isinstance(name, float) and math.isnan(name)):
...         print(idx)
...
2
4
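If you need this check in more than one place, the same condition could be wrapped in a small function. The sketch below is one possible way to do it, not part of the original answer: the name is_blank_or_nan is made up for illustration, and float("nan") stands in for np.nan (which is itself a Python float) so that the example needs only the standard library.

```python
import math


def is_blank_or_nan(value):
    """Hypothetical helper: True for '' or a float NaN, False for anything else."""
    if isinstance(value, str):
        return value == ""
    # Guard with isinstance so math.isnan never receives a non-float (e.g. None).
    return isinstance(value, float) and math.isnan(value)


names = ["Pat", "Sam", float("nan"), "Tom", ""]
missing = [idx for idx, name in enumerate(names) if is_blank_or_nan(name)]
print(missing)  # [2, 4]
```

Unlike pd.isnull, this returns False for None, which matches the stricter "NaN but not None" behavior described above.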