
Pandas: drop columns with all NaN's

I realize that dropping NaNs from a dataframe is as easy as df.dropna but for some reason that isn't working on mine and I'm not sure why.

Here is my original dataframe:

fish_frame1:
                      0   1   2         3   4       5   6          7
0               #0915-8 NaN NaN       NaN NaN     NaN NaN        NaN
1                   NaN NaN NaN  LIVE WGT NaN  AMOUNT NaN      TOTAL
2               GBW COD NaN NaN     2,280 NaN   $0.60 NaN  $1,368.00
3               POLLOCK NaN NaN     1,611 NaN   $0.01 NaN     $16.11
4                 WHAKE NaN NaN       441 NaN   $0.70 NaN    $308.70
5           GBE HADDOCK NaN NaN     2,788 NaN   $0.01 NaN     $27.88
6           GBW HADDOCK NaN NaN    16,667 NaN   $0.01 NaN    $166.67
7               REDFISH NaN NaN       932 NaN   $0.01 NaN      $9.32
8    GB WINTER FLOUNDER NaN NaN       145 NaN   $0.25 NaN     $36.25
9   GOM WINTER FLOUNDER NaN NaN    25,070 NaN   $0.35 NaN  $8,774.50
10        GB YELLOWTAIL NaN NaN        26 NaN   $1.75 NaN     $45.50

The code that follows is an attempt to drop all NaNs as well as any columns with more than 3 NaNs (either one, or both, should work I think):

fish_frame.dropna()
fish_frame.dropna(thresh=len(fish_frame) - 3, axis=1)

This produces:

fish_frame1 after dropna:
                      0   1   2         3   4       5   6          7
0               #0915-8 NaN NaN       NaN NaN     NaN NaN        NaN
1                   NaN NaN NaN  LIVE WGT NaN  AMOUNT NaN      TOTAL
2               GBW COD NaN NaN     2,280 NaN   $0.60 NaN  $1,368.00
3               POLLOCK NaN NaN     1,611 NaN   $0.01 NaN     $16.11
4                 WHAKE NaN NaN       441 NaN   $0.70 NaN    $308.70
5           GBE HADDOCK NaN NaN     2,788 NaN   $0.01 NaN     $27.88
6           GBW HADDOCK NaN NaN    16,667 NaN   $0.01 NaN    $166.67
7               REDFISH NaN NaN       932 NaN   $0.01 NaN      $9.32
8    GB WINTER FLOUNDER NaN NaN       145 NaN   $0.25 NaN     $36.25
9   GOM WINTER FLOUNDER NaN NaN    25,070 NaN   $0.35 NaN  $8,774.50
10        GB YELLOWTAIL NaN NaN        26 NaN   $1.75 NaN     $45.50

I'm a novice with Pandas, so I'm not sure whether this isn't working because I'm doing something wrong, misunderstanding something, or misusing a function. Any help is appreciated, thanks.

theprowler asked Jul 17 '17 14:07



1 Answer

From the dropna docstring:

Drop the columns where all elements are NaN:
df.dropna(axis=1, how='all')
     A    B  D
0  NaN  2.0  0
1  3.0  4.0  1
2  NaN  NaN  5
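
The question's snippet calls dropna without keeping the result. By default dropna returns a new DataFrame and leaves the original untouched, so that may be why the printed frame looks unchanged (an assumption, since the full script isn't shown). Below is a minimal sketch on a hypothetical four-row miniature of the question's frame; the data and the names cleaned and at_least_three are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the question's frame: columns 1 and 2 are entirely NaN.
fish_frame = pd.DataFrame(
    [
        ["#0915-8", np.nan, np.nan, np.nan],
        [np.nan, np.nan, np.nan, "LIVE WGT"],
        ["GBW COD", np.nan, np.nan, "2,280"],
        ["POLLOCK", np.nan, np.nan, "1,611"],
    ]
)

# Calling dropna without assigning the result leaves fish_frame as it was.
fish_frame.dropna(axis=1, how="all")
unchanged_columns = list(fish_frame.columns)

# Assign the returned frame to keep the change: only the all-NaN columns go.
cleaned = fish_frame.dropna(axis=1, how="all")
cleaned_columns = list(cleaned.columns)

# thresh is the minimum number of non-NaN values a column needs to be kept.
at_least_three = fish_frame.dropna(axis=1, thresh=3)

print(unchanged_columns)
print(cleaned_columns)
print(list(at_least_three.columns))
```

Note that a bare dropna() with no arguments drops every row containing at least one NaN. On a frame like the one in the question, where each row has a NaN somewhere, assigning that result would give an empty frame, so how='all' or thresh with axis=1 is probably the better fit.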
Corley Brigman answered Sep 17 '22 23:09