I want to drop all NaN values in one of my columns, but when I use df.dropna(axis=0, inplace=True)
it erases my entire dataframe. Why is this happening?
I've used both df.dropna()
and df.dropna(axis=0, inplace=True),
and neither removes just the NaN values.
I'm binning my data so I can run a Gaussian model, but I can't do that with NaN values. I want to remove them and still have my dataframe to run the model.
Before and after
EXAMPLE 1: Remove rows with any missing value. Remember that when we created our DataFrame, rows 0, 2, 3, and 7 all contained missing values. Here, after using dropna(), rows 0, 2, 3, and 7 have all been removed. That's really all dropna() does by default: it removes rows with missing values (it understands that NaN is a missing value).
The dropna() method removes the rows that contain NULL values. It returns a new DataFrame object unless the inplace parameter is set to True, in which case it removes the rows from the original DataFrame instead.
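A minimal sketch of the return-value behaviour described above. The DataFrame, its column names, and its values are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; not from the original question.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, np.nan]})

# Without inplace, dropna() returns a new DataFrame and leaves df untouched.
cleaned = df.dropna()

# With inplace=True, the rows are removed from the object itself.
# The pandas documentation states that the call returns None in this case,
# so assigning its result back to the same name would lose the DataFrame.
df_copy = df.copy()
df_copy.dropna(inplace=True)

print(len(df), len(cleaned), len(df_copy))
```

Only the first row has no missing value, so both cleaned and df_copy end up with one row while df keeps all three.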
del is also an option: you can delete a column with del df['column name']. Python maps this operation to df.__delitem__('column name'), which is an internal method of DataFrame.
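A short sketch of deleting a column with del, assuming a made-up DataFrame in which one column (here called "unwanted", a hypothetical name) is mostly NaN. Note that this removes the whole column rather than the rows containing NaN, so it only helps when the column itself is not needed.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; the column names are assumptions.
df = pd.DataFrame({"keep": [1, 2, 3], "unwanted": [np.nan, np.nan, 7.0]})

# del modifies df in place and removes the entire column.
del df["unwanted"]

print(list(df.columns))
```

df.drop(columns=[...]) is the non-mutating alternative if you would rather get a new DataFrame back.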
Not sure about your case, but here is the solution that worked in mine:
The ones that didn't work:
df = df.dropna()  # ==> makes the df empty.
df = df.dropna(axis=0, inplace=True)  # ==> df becomes None, because dropna returns None when inplace=True.
df.dropna(axis=0, inplace=True)  # ==> makes the df empty.
The one that worked:
df.dropna(how='all', axis=0, inplace=True)  # ==> worked very well.
Thanks to Anky above for his comment.
By default, dropna uses how='any', which means it deletes every row that has any NaN. If every row has at least one NaN somewhere, every row is dropped and the dataframe ends up empty.
This, as you found out, deletes only the rows in which all columns are NaN:
df.dropna(how='all', inplace=True)
or, more basic:
newDF = df.dropna(how='all')
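The difference between the two how settings can be sketched on a small made-up DataFrame (all names and values here are assumptions for illustration). The last line uses the subset parameter, which the answers above do not mention but which is a documented dropna argument and seems closest to the question's goal of dropping NaN in a single column.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the "empty" column is entirely NaN, which is one
# common reason a plain dropna() ends up removing every row.
df = pd.DataFrame({
    "value": [1.0, np.nan, 3.0, np.nan],
    "group": ["x", "y", None, None],
    "empty": [np.nan, np.nan, np.nan, np.nan],
})

# how='any' (the default): drop a row if any column is missing.
any_dropped = df.dropna()

# how='all': drop a row only if every column is missing.
all_dropped = df.dropna(how="all")

# subset: only consider the listed column(s) when deciding what to drop.
by_column = df.dropna(subset=["value"])

print(len(any_dropped), len(all_dropped), list(by_column.index))
```

Here the default call removes all four rows, how='all' removes only the last row, and the subset call keeps the rows at index 0 and 2, where "value" is present.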