
Efficiently checking if arbitrary object is NaN in Python / numpy / pandas?

People also ask

How do I check if an object is NaN in pandas?

pd.isna() in the pandas library checks whether a value is null/NaN. It returns True if the value is NaN/null.

How do I check if an element is NumPy NaN?

The numpy module provides the function numpy.isnan() to check whether an element is NaN. It takes an array as input and returns a boolean array of the same shape, where each value indicates whether the element at the corresponding position in the original array is NaN.
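
As a small illustration of why np.isnan() alone does not cover arbitrary objects, here is a minimal sketch. The array contents are made up for the example. It shows the elementwise check on a float array, and that a non-numeric (object) array raises TypeError instead of returning False.

```python
import numpy as np

# Elementwise NaN check on a float array.
arr = np.array([1.0, np.nan, 3.0])
mask = np.isnan(arr)
print(mask.tolist())

# np.isnan is only defined for numeric dtypes; an object array raises TypeError.
try:
    np.isnan(np.array(["apple", np.nan], dtype=object))
    raised = False
except TypeError:
    raised = True
print(raised)
```

So for mixed or string data you either need to guard the call yourself or use a function that accepts any object.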

How do I check if a string is NaN pandas?

Using the pd.isna() function: pd.isna() is a pandas function that checks whether a value is NaN.

Is NaN in Python NumPy?

Yes. NumPy defines nan (np.nan) as a constant. It stands for "not a number" and is a floating-point value used to represent an undefined result, so integer arrays cannot hold it.


pandas.isnull() (also pd.isna(), in newer versions) checks for missing values in both numeric and string/object arrays. From the documentation, it checks for:

NaN in numeric arrays, None/NaN in object arrays

Quick example:

import pandas as pd
import numpy as np
s = pd.Series(['apple', np.nan, 'banana'])
pd.isnull(s)
Out[9]: 
0    False
1     True
2    False
dtype: bool
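
The same function also accepts single values of any type, which is closer to the "arbitrary object" case in the question. A minimal sketch follows. The helper name is_missing is hypothetical and not part of pandas. It treats list-like input as "not a single missing value", because pd.isna() returns an array for such input rather than one bool.

```python
import numpy as np
import pandas as pd

def is_missing(obj):
    """Hypothetical helper: True only if obj is a single missing value."""
    result = pd.isna(obj)
    # For list-likes pd.isna returns an array, so only accept a scalar bool here.
    return isinstance(result, (bool, np.bool_)) and bool(result)

values = ["apple", 1, 1.5, None, np.nan, float("nan"), pd.NaT]
results = [is_missing(v) for v in values]
print(results)

# A list is not itself a missing value, even if it contains NaN.
print(is_missing([1.0, np.nan]))
```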

The idea of using numpy.nan to represent missing values is something that pandas introduced, which is why pandas has the tools to deal with it.

This works for datetimes too (if you use pd.NaT you won't need to specify the dtype):

In [24]: s = pd.Series([pd.Timestamp('20130101'), np.nan, pd.Timestamp('20130102 9:30')], dtype='M8[ns]')

In [25]: s
Out[25]: 
0   2013-01-01 00:00:00
1                   NaT
2   2013-01-02 09:30:00
dtype: datetime64[ns]

In [26]: pd.isnull(s)
Out[26]: 
0    False
1     True
2    False
dtype: bool
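
For the pd.NaT variant mentioned above, a minimal sketch (same timestamps as in the example, with no dtype argument) might look like this. The exact datetime resolution that gets inferred can depend on the pandas version, so only the kind of dtype is checked.

```python
import pandas as pd

# With pd.NaT as the missing marker, pandas infers a datetime dtype on its own.
s = pd.Series([pd.Timestamp("20130101"), pd.NaT, pd.Timestamp("20130102 9:30")])
print(s.dtype)

mask = pd.isna(s).tolist()
print(mask)
```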

Is your type really arbitrary? If you know it is just going to be an int, float, or string, you could just do

 if val.dtype == float and np.isnan(val):

This assumes the value is wrapped in NumPy, so it will always have a dtype. Only float and complex values can be NaN.
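
One caveat with comparing the dtype to float is that it only matches 64-bit floats, so a float32 or complex NaN would slip through. A slightly more general sketch of the same idea is below. The helper name is_nan_numpy is hypothetical, and it uses np.issubdtype with np.inexact (floats and complex numbers) as the guard.

```python
import numpy as np

def is_nan_numpy(val):
    """Hypothetical helper: True only for a single float/complex NaN."""
    val = np.asarray(val)  # gives every input a dtype, even plain Python objects
    if val.ndim != 0:
        return False  # only handle single values in this sketch
    # Guard first: np.isnan raises TypeError on strings and object dtypes.
    return bool(np.issubdtype(val.dtype, np.inexact) and np.isnan(val))

print(is_nan_numpy(np.float32("nan")))
print(is_nan_numpy(complex("nan")))
print(is_nan_numpy("apple"))
print(is_nan_numpy(3))
```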


I found this brilliant solution here; it relies on the simple fact that NaN != NaN: https://www.codespeedy.com/check-if-a-given-string-is-nan-in-python/

Using the example above, you can simply do the following. This should work on different types of objects, as it simply relies on the fact that NaN is not equal to NaN.

 import numpy as np
 import pandas as pd
 s = pd.Series(['apple', np.nan, 'banana'])
 s.apply(lambda x: x!=x)
 Out[252]: 
 0    False
 1     True
 2    False
 dtype: bool
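
A few edge cases are worth knowing before relying on the x != x trick for truly arbitrary objects. The sketch below wraps it in a hypothetical helper, is_nan_by_self_inequality, and compares it with pd.isna(). None equals itself, so it is not detected. pd.NA and multi-element arrays do not give a plain bool from x != x, so the helper treats them as "not NaN".

```python
import numpy as np
import pandas as pd

def is_nan_by_self_inequality(x):
    """Hypothetical helper: NaN is a value that compares unequal to itself."""
    try:
        return bool(x != x)
    except (TypeError, ValueError):
        # pd.NA != pd.NA is pd.NA (ambiguous truth value -> TypeError);
        # a multi-element array gives an array (ambiguous -> ValueError).
        return False

samples = [np.nan, "apple", None, pd.NaT, pd.NA, np.array([1.0, np.nan])]
by_inequality = [is_nan_by_self_inequality(x) for x in samples]
print(by_inequality)

# pd.isna treats None and pd.NA as missing as well.
print(pd.isna(None), pd.isna(pd.NA))
```

If None should count as missing too, pd.isna() is the safer choice. The x != x check is mainly useful when you only care about float NaN.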