I want to find the number of NaN
in each column of my data so that I can drop a column if it has fewer NaN
than some threshold. I looked but wasn't able to find any function for this. value_counts
is too slow for me because most of the values are distinct and I'm only interested in the NaN
count.
The count property directly gives the count of non-NaN values in each column. So, we can get the count of NaN values, if we know the total number of observations.
How do you Count the Number of Occurrences in a data frame? To count the number of occurrences in e.g. a column in a dataframe you can use Pandas value_counts() method. For example, if you type df['condition']. value_counts() you will get the frequency of each unique value in the column “condition”.
Person. COUNT(expression) returns the number of values in expression, which is a table column name or an expression that evaluates to a column of data. COUNT(expression) does not count NULL values. This query returns the number of non-NULL values in the Name column of Sample.
You can use the isna()
method (or it's alias isnull()
which is also compatible with older pandas versions < 0.21.0) and then sum to count the NaN values. For one column:
In [1]: s = pd.Series([1,2,3, np.nan, np.nan]) In [4]: s.isna().sum() # or s.isnull().sum() for older pandas versions Out[4]: 2
For several columns, it also works:
In [5]: df = pd.DataFrame({'a':[1,2,np.nan], 'b':[np.nan,1,np.nan]}) In [6]: df.isna().sum() Out[6]: a 1 b 2 dtype: int64
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With