Level NaN must be same as name

I am trying to count how many times NaN appears in a column of a dataframe using this code:

count = enron_df.loc['salary'].count('NaN')

But every time I run this, I get the following error:

KeyError: 'Level NaN must be same as name (None)'

I searched around the web a lot trying to find a solution, but to no avail.

asked Apr 13 '18 13:04 by Ian Dzindo



2 Answers

If NaNs are missing values:

import numpy as np
import pandas as pd

enron_df = pd.DataFrame({'salary': [np.nan, np.nan, 1, 5, 7]})
print(enron_df)
   salary
0     NaN
1     NaN
2     1.0
3     5.0
4     7.0

# isna() marks missing values as True; sum() counts the True entries
count = enron_df['salary'].isna().sum()
# alternative (isnull is an alias of isna)
# count = enron_df['salary'].isnull().sum()
print(count)
2
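
As a side note on the original attempt: Series.count() does not search for a value. Called with no arguments, it returns the number of non-missing entries, so the missing count can also be derived from it. In older pandas releases its one optional argument was an index level, which is probably why passing 'NaN' produced the "Level NaN" KeyError; that reading is an assumption based on the error text. A minimal sketch, using the same made-up data as above:

```python
import numpy as np
import pandas as pd

# Same illustrative data as above (not the real Enron dataset)
enron_df = pd.DataFrame({'salary': [np.nan, np.nan, 1, 5, 7]})

salary = enron_df['salary']

# count() with no arguments counts the non-missing values only
non_missing = salary.count()

# so the number of missing values is the difference from the total length
missing = len(salary) - non_missing

# value_counts(dropna=False) lists NaN alongside the real values
print(salary.value_counts(dropna=False))
print(missing)
```

Here missing matches the result of isna().sum().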

If NaNs are strings:

enron_df = pd.DataFrame({'salary': ['NaN', 'NaN', 1, 5, 'NaN']})
print(enron_df)
  salary
0    NaN
1    NaN
2      1
3      5
4    NaN

# eq('NaN') is True only where the cell is exactly the string 'NaN'
count = enron_df['salary'].eq('NaN').sum()
# alternative
# count = (enron_df['salary'] == 'NaN').sum()
print(count)
3
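
If it is unclear whether the column holds real missing values, the string 'NaN', or a mix of both, one option is to convert the column to numbers first and count afterwards. A minimal sketch, assuming every legitimate salary is numeric so that anything which fails to parse can be treated as missing (the mixed data is made up for illustration):

```python
import numpy as np
import pandas as pd

# Illustrative mixed column: a real NaN, 'NaN' strings and numbers
enron_df = pd.DataFrame({'salary': [np.nan, 'NaN', 1, 5, 'NaN']})

# errors='coerce' turns anything that cannot be parsed as a number into NaN
salary_numeric = pd.to_numeric(enron_df['salary'], errors='coerce')

# After the conversion, real and textual NaNs are counted the same way
count = salary_numeric.isna().sum()
print(count)
```

Keep in mind that this also counts any other unparseable entry (for example 'n/a') as missing.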
answered Sep 19 '22 20:09 by jezrael


Try it like this (these compare against the string 'NaN', so they assume the missing values are stored as text):

count = df.loc[df['salary']=='NaN'].shape[0]

Or maybe better:

count = df.loc[df['salary']=='NaN', 'salary'].size

And, going down your path, you'd need something like this:

count = df.loc[:, 'salary'].str.count('NaN').sum()
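
Two caveats for the .str.count variant, shown in a small sketch with made-up data: it counts pattern matches inside each string rather than whole cells, and it only works when the column holds strings (on a purely numeric column, .str raises an AttributeError).

```python
import pandas as pd

# Made-up data: one cell contains the text 'NaN' twice
df = pd.DataFrame({'salary': ['NaN', 'NaNNaN', '1', '5']})

# str.count counts regex matches per cell, so 'NaNNaN' contributes 2
matches = df['salary'].str.count('NaN').sum()

# comparing whole cells counts only exact 'NaN' entries
cells = (df['salary'] == 'NaN').sum()

print(matches, cells)
```

If the goal is the number of cells equal to 'NaN', the equality comparison is the safer of the two.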
answered Sep 19 '22 20:09 by zipa