
The code "df.dropna" in Python erases my entire dataframe. What is wrong with my code?

I want to drop all NaN values in one of my columns, but when I use df.dropna(axis=0, inplace=True) it erases my entire dataframe. Why is this happening?

I've tried both df.dropna() and df.dropna(axis=0, inplace=True), and neither removes just the NaN values.

I'm binning my data so I can run a Gaussian model, but I can't do that with NaN values. I want to remove them and still have a dataframe left to run the model on.

[Image: before and after data]

Piper Ramirez asked Apr 03 '19 14:04




2 Answers

I'm not sure about your case, but here is the solution that worked in mine.

The ones that didn't work:

df = df.dropna()  #==> makes the df empty (default how='any')
df = df.dropna(axis=0, inplace=True)  #==> df becomes None, because inplace=True returns None
df.dropna(axis=0, inplace=True)  #==> makes the df empty (default how='any')

The one that worked:

df.dropna(how='all',axis=0, inplace=True) #==> Worked very well...

Thanks to Anky above for his comment.
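To illustrate why the variants above behave so differently, here is a minimal sketch. It assumes a small made-up frame (the column names `a` and `b` are hypothetical, not from the question) in which every row has at least one NaN but no row is entirely NaN, which seems to match the situation described.

```python
import numpy as np
import pandas as pd

# Hypothetical data: every row has at least one NaN,
# but no row consists only of NaN.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, 2.0, np.nan],
})

# Default how='any': a row is dropped if it contains at least one NaN.
dropped_any = df.dropna()

# how='all': a row is dropped only if every value in it is NaN.
dropped_all = df.dropna(how="all")

# inplace=True modifies the frame itself, so the return value
# should not be assigned back to the variable.
in_place = df.copy()
in_place.dropna(how="all", inplace=True)

print(len(dropped_any))  # 0
print(len(dropped_all))  # 3
print(len(in_place))     # 3
```

Note that with how='all' the NaN values inside the surviving rows are still there; only rows that are completely empty get removed.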

HassanSh__3571619 answered Oct 01 '22 10:10


By default, dropna uses how='any', which means it deletes every row that contains at least one NaN.

The following call, as you found out, deletes only the rows in which all columns are NaN:

df.dropna(how='all', inplace=True)

or, leaving df unchanged and returning a new dataframe:

newDF = df.dropna(how='all')
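
Since the question is about NaN values in one particular column, the `subset` parameter of dropna may be closer to what is wanted. The sketch below is only an illustration under assumptions: the frame and the column names `value` and `notes` are made up, with `value` standing in for the column being binned.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 'value' is the column to be binned,
# 'notes' is a mostly empty column we do not care about.
df = pd.DataFrame({
    "value": [0.5, np.nan, 2.5, 4.0],
    "notes": [np.nan, np.nan, "ok", np.nan],
})

# Drop only the rows where 'value' is NaN; NaN in 'notes' is ignored.
clean = df.dropna(subset=["value"])

print(clean["value"].tolist())  # [0.5, 2.5, 4.0]
print(len(clean))               # 3
```

Unlike how='all', this leaves no NaN in the chosen column, and unlike the default how='any', it does not throw away rows just because some other column is empty.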

Lorenzo Bassetti answered Oct 01 '22 09:10