I am trying to find all NaNs and empty strings (i.e. "") in a Python list of strings. Please see the following code with 3 options:
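import numpy as np
import pandas as pd

names = ['Pat', 'Sam', np.nan, 'Tom', '']
for idx, name in enumerate(names):
    # Try one option at a time by swapping which line is uncommented.
    if name == '':          # Option 1
    # if pd.isnull(name):   # Option 2
    # if np.isnan(name):    # Option 3
        print(idx)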
Option 1: This check, name == "", doesn't catch NaN.
Option 2: This check, pd.isnull(name), doesn't catch the empty string.
Option 3: This check, np.isnan(name), gives the following error on the strings (e.g. "Pat"):
----> 6     if np.isnan(name):
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
Question: Is there any function/method that can check for both empty strings and NaNs and does not give an error when it encounters a string?
The math.isnan() method checks whether a value is NaN (Not a Number). It returns True if the specified value is NaN and False otherwise. It accepts only numeric arguments, so passing a string such as "Pat" raises a TypeError.
NaN stands for Not a Number and is one of the common ways to represent a missing value in Python and in pandas DataFrames. Missing values often need to be replaced with something that makes sense for the column, such as zero for numeric columns and a blank (empty) string for string-type columns.
We can check whether a value is NaN by using the property that NaN != NaN. Let us define a boolean function isNaN() which returns True if the given argument is NaN and False otherwise. We can also convert a value to float and then check whether the result is NaN.
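The self-inequality property can be sketched as below. This is only an illustration: the name isNaN follows the text above, the sample list mirrors the one in the question, and float('nan') stands in for np.nan so that no third-party import is needed (an assumption made for portability).

```python
def isNaN(value):
    """Return True if value is NaN, relying on NaN being unequal to itself."""
    return value != value


# float('nan') is used here in place of np.nan (assumption: stdlib only).
names = ['Pat', 'Sam', float('nan'), 'Tom', '']

# Strings, including the empty string, compare equal to themselves,
# so only the NaN entry is reported and no TypeError is raised.
nan_indices = [idx for idx, name in enumerate(names) if isNaN(name)]
print(nan_indices)  # prints [2]
```

Note that this check alone does not flag the empty string, so it still has to be combined with a name == '' test to cover both cases.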
There is a way to combine options #1 and #2 and get the result you are looking for:
names = ['Pat', 'Sam', np.nan, 'Tom', '']
for idx, name in enumerate(names):
    if not name or pd.isnull(name):
        print(idx)
Just use both:
>>> names=['Pat','Sam', np.nan, 'Tom', '']
>>> for idx,name in enumerate(names):
...     if name == '' or pd.isnull(name):
...         print(idx)
...
2
4
However, realize that:
>>> pd.isnull(None)
True
So if you want to check specifically for NaN and not None, use math.isnan (while guarding against passing non-float values to math.isnan):
>>> import math
>>> for idx,name in enumerate(names):
...     if name == '' or (isinstance(name, float) and math.isnan(name)):
...         print(idx)
...
2
4
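If the check is needed in several places, the guarded condition can be wrapped in a small helper. The following is a minimal sketch, not a library function: the name is_missing and its treat_none_as_missing parameter are hypothetical, and float('nan') is used instead of np.nan so the snippet depends only on the standard library.

```python
import math


def is_missing(value, treat_none_as_missing=False):
    """Return True for NaN or the empty string; None is optional (hypothetical helper)."""
    if value is None:
        return treat_none_as_missing
    if isinstance(value, float):
        # math.isnan is only called on floats, so strings never reach it.
        return math.isnan(value)
    return value == ''


# Same list as above, with None appended to show the NaN/None distinction.
names = ['Pat', 'Sam', float('nan'), 'Tom', '', None]

strict = [idx for idx, name in enumerate(names) if is_missing(name)]
lenient = [idx for idx, name in enumerate(names)
           if is_missing(name, treat_none_as_missing=True)]
print(strict)   # prints [2, 4]
print(lenient)  # prints [2, 4, 5]
```

Keeping the None handling behind a flag makes the difference from pd.isnull explicit, since pd.isnull reports None as missing.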