
How to check if a particular cell in a pandas DataFrame is null?

I have the following df in pandas.

0       A     B     C
1       2   NaN     8

How can I check if df.iloc[1]['B'] is NaN?

I tried using df.isnull() and I get a table like this:

0       A     B      C
1   False  True  False

but I am not sure how to index this table, or whether this is an efficient way of doing the job at all.

Newskooler asked Mar 21 '17 08:03


2 Answers

Use pd.isnull on the value; to select the cell, use loc or iloc:

print (df)
   0  A   B  C
0  1  2 NaN  8

print (df.loc[0, 'B'])
nan

a = pd.isnull(df.loc[0, 'B'])
print (a)
True

print (df['B'].iloc[0])
nan

a = pd.isnull(df['B'].iloc[0])
print (a)
True
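
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The one-row frame is an assumption built to resemble the question's data, and the variable names are only illustrative. It also shows how to index the boolean table from the question, and why comparing against np.nan with == does not work.

```python
import numpy as np
import pandas as pd

# Hypothetical one-row frame resembling the data in the question.
df = pd.DataFrame({"A": [2], "B": [np.nan], "C": [8]})

# Label-based scalar access, then test the single value.
cell = df.at[0, "B"]
is_missing = pd.isna(cell)  # pd.isna is an alias of pd.isnull

# Position-based scalar access: row 0, column 1 (which is "B").
by_position = pd.isna(df.iat[0, 1])

# Indexing the boolean table also works, but it builds the whole
# mask first just to read one entry from it.
from_mask = df.isna().at[0, "B"]

# NaN never compares equal to anything, including itself,
# so an equality test cannot detect it.
naive = cell == np.nan

print(is_missing, by_position, from_mask, naive)
```

Testing the scalar directly avoids computing the full mask, which is probably what you want when only one cell matters.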
jezrael answered Oct 13 '22 05:10


jezrael's response is spot on. If you are only concerned with whether there are any NaN values at all, I explored whether there is a faster option, since in my experience summing flat arrays is (strangely) faster than counting. This code seems faster:

df.isnull().values.any()

For example:

In [1]: import numpy as np; import pandas as pd

In [2]: df = pd.DataFrame(np.random.randn(1000,1000))

In [3]: df[df > 0.9] = np.nan

In [4]: %timeit df.isnull().any().any()
100 loops, best of 3: 14.7 ms per loop

In [5]: %timeit df.isnull().values.sum()
100 loops, best of 3: 2.15 ms per loop

In [6]: %timeit df.isnull().sum().sum()
100 loops, best of 3: 18 ms per loop

In [7]: %timeit df.isnull().values.any()
1000 loops, best of 3: 948 µs per loop
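
Timings like these depend on the machine and on the pandas and NumPy versions, so it may be worth re-running the comparison yourself. Below is a rough sketch using the standard timeit module instead of %timeit; the smaller 200x200 frame, the fixed seed, and the candidates dict are my own assumptions to keep it quick and repeatable. Note that these expressions answer "is there any NaN in the whole frame", not "is this one cell NaN".

```python
import timeit

import numpy as np
import pandas as pd

# Smaller frame than in the answer so the sketch runs quickly (assumption).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((200, 200)))
df[df > 0.9] = np.nan

# The four expressions timed above, wrapped so timeit can call them.
candidates = {
    "isnull().any().any()": lambda: df.isnull().any().any(),
    "isnull().values.sum()": lambda: df.isnull().values.sum(),
    "isnull().sum().sum()": lambda: df.isnull().sum().sum(),
    "isnull().values.any()": lambda: df.isnull().values.any(),
}

# The sum-based variants return a count, so coerce everything to bool
# to confirm all four agree on whether any NaN is present.
results = {name: bool(fn()) for name, fn in candidates.items()}

# Total seconds for 20 calls of each expression.
timings = {name: timeit.timeit(fn, number=20) for name, fn in candidates.items()}

for name, seconds in timings.items():
    print(f"{name:<24} {seconds / 20 * 1e3:.3f} ms per call  any NaN: {results[name]}")
```

No expected numbers are shown here on purpose, since the ranking may differ between environments.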
ankur09011 answered Oct 13 '22 05:10