 

Pandas select all columns without NaN

I have a DataFrame with 200 columns, most of which contain NaNs. I would like to select all columns with no NaNs, or at least those with the fewest NaNs. I've tried dropping with a threshold and with notnull(), but without success. Any ideas?

df.dropna(thresh=2, inplace=True)
df_notnull = df[df.notnull()]

DF for example:

col1  col2 col3
23     45  NaN
54     39  NaN
NaN    45  76
87     32  NaN

The output should look like:

 df.dropna(axis=1, thresh=2)

    col1  col2
    23     45  
    54     39  
    NaN    45  
    87     32  
Hristo Stoychev asked Nov 21 '17 13:11




1 Answer

You can keep only the columns that are not entirely NaN using

df = df[df.columns[~df.isnull().all()]]

Or

null_cols = df.columns[df.isnull().all()]
df.drop(null_cols, axis = 1, inplace = True)
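Note that isnull().all() only flags columns in which every value is NaN. If the goal is columns with no NaN at all, as the question asks, any() is the check to use. Here is a minimal sketch of the difference, assuming the example frame from the question (the variable names are illustrative, not from the original answer):

```python
import numpy as np
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({
    "col1": [23, 54, np.nan, 87],
    "col2": [45, 39, 45, 32],
    "col3": [np.nan, np.nan, 76, np.nan],
})

# Columns that are not entirely NaN (what ~df.isnull().all() selects).
not_all_nan = df.loc[:, ~df.isnull().all()]

# Columns with no NaN at all (any() instead of all()).
no_nan = df.loc[:, ~df.isnull().any()]

# Built-in alternative: drop every column that contains at least one NaN.
no_nan_dropna = df.dropna(axis=1)

print(list(not_all_nan.columns))  # ['col1', 'col2', 'col3']
print(list(no_nan.columns))       # ['col2']
```

On this data no column is fully NaN, so the all() version keeps everything, while the any() version keeps only col2.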

If you wish to remove columns based on a certain percentage of NaNs, say columns in which more than 90% of the values are null:

cols_to_delete = df.columns[df.isnull().sum()/len(df) > .90]
df.drop(cols_to_delete, axis = 1, inplace = True)
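The same idea can be wrapped in a small helper that returns a new frame instead of modifying df in place. This is only a sketch: drop_sparse_columns and max_nan_fraction are hypothetical names, and it assumes the example frame from the question.

```python
import numpy as np
import pandas as pd


def drop_sparse_columns(df, max_nan_fraction=0.9):
    """Return a copy of df without columns whose NaN share exceeds max_nan_fraction."""
    # The mean of a boolean mask is the fraction of True values,
    # i.e. the share of NaNs in each column.
    nan_fraction = df.isnull().mean()
    return df.loc[:, nan_fraction <= max_nan_fraction].copy()


# Example frame from the question.
df = pd.DataFrame({
    "col1": [23, 54, np.nan, 87],
    "col2": [45, 39, 45, 32],
    "col3": [np.nan, np.nan, 76, np.nan],
})

# col3 is 75% NaN, so a 50% limit removes it; col1 (25% NaN) stays.
dense = drop_sparse_columns(df, max_nan_fraction=0.5)
print(list(dense.columns))  # ['col1', 'col2']
```

For this frame, df.dropna(axis=1, thresh=2) gives the same two columns, because thresh counts the minimum number of non-NaN values a column must have to be kept.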
Vaishali answered Oct 01 '22 18:10