
How to count the NaN values in a column in pandas DataFrame

I want to find the number of NaN in each column of my data so that I can drop a column if it has fewer NaN than some threshold. I looked but wasn't able to find any function for this. value_counts is too slow for me because most of the values are distinct and I'm only interested in the NaN count.

user3799307 asked Oct 08 '14 21:10




1 Answer

You can use the isna() method (or its alias isnull(), which also works in pandas versions older than 0.21.0) and then call sum() to count the NaN values. For one column:

In [1]: s = pd.Series([1,2,3, np.nan, np.nan])

In [4]: s.isna().sum()   # or s.isnull().sum() for older pandas versions
Out[4]: 2

The same call works on a whole DataFrame and returns one count per column:

In [5]: df = pd.DataFrame({'a':[1,2,np.nan], 'b':[np.nan,1,np.nan]})

In [6]: df.isna().sum()
Out[6]:
a    1
b    2
dtype: int64
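
The per-column counts can then drive the thresholding step the question describes. The following is only a minimal sketch: the data, the column names, the max_nan threshold, and the direction of the comparison (keep columns with at most max_nan missing values) are all assumptions made for illustration, so flip the comparison if your rule is the opposite.

```python
import numpy as np
import pandas as pd

# Illustrative data; column names and values are made up for this sketch.
df = pd.DataFrame({
    "a": [1, 2, np.nan, 4],
    "b": [np.nan, 1, np.nan, np.nan],
    "c": [1, 2, 3, 4],
})

# NaN count per column, using isna().sum() as described above.
nan_counts = df.isna().sum()

# Equivalent route: count() returns the number of non-NaN values per column,
# so subtracting it from the row count gives the NaN count.
nan_counts_alt = len(df) - df.count()

# Hypothetical threshold: keep columns with at most this many NaN values.
max_nan = 1
kept = df.loc[:, nan_counts <= max_nan]

print(nan_counts.to_dict())
print(list(kept.columns))
```

If a proportion is more convenient than an absolute count, df.isna().mean() gives the fraction of missing values per column, and the same boolean-mask selection applies.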
joris answered Sep 19 '22 16:09
