Pandas - dropping rows with missing data not working using .isnull(), notnull(), dropna()

Tags: python, pandas

This is really weird. I have tried several ways of dropping rows with missing data from a pandas DataFrame, but none of them seems to work. Here is the code. I uncomment one method at a time; these are the three I have tried in various forms, and the active line is the latest:

import pandas as pd
Test = pd.DataFrame({'A':[1,2,3,4,5],'B':[1,2,'NaN',4,5],'C':[1,2,3,'NaT',5]})
print(Test)
#Test = Test.loc[Test.C.notnull()]
#Test = Test.dropna()
Test = Test[~Test[Test.columns.values].isnull()]
print("And now")
print(Test)

But in all cases, all I get is this:

   A    B    C
0  1    1    1
1  2    2    2
2  3  NaN    3
3  4    4  NaT
4  5    5    5
And now
   A    B    C
0  1    1    1
1  2    2    2
2  3  NaN    3
3  4    4  NaT
4  5    5    5

Am I making a mistake somewhere, or what is the problem? Ideally, I would like to get this:

   A    B    C
0  1    1    1
1  2    2    2
4  5    5    5


asked Sep 06 '16 02:09

durbachit

2 Answers

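Try this on the original data:

import numpy as np

# Turn the string placeholders into real missing values, then drop those rows
Test.replace(['NaN', 'NaT'], np.nan, inplace=True)
Test = Test.dropna()
Test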

Or build the data with real missing values and do this:

import pandas as pd
import numpy as np 

Test = pd.DataFrame({'A':[1,2,3,4,5],'B':[1,2,np.nan,4,5],'C':[1,2,3,pd.NaT,5]})
print(Test)
Test = Test.dropna()
print(Test)

The second print shows:

   A    B  C
0  1  1.0  1
1  2  2.0  2
4  5  5.0  5
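
If the placeholders come from a file and you can't easily rebuild the frame, another option is to coerce every column to numeric. This is only a minimal sketch, assuming all columns are supposed to hold numbers; `cleaned` is just an illustrative name, and anything that can't be parsed as a number ends up as NaN.

```python
import pandas as pd

# Same frame as in the question: 'NaN' and 'NaT' are plain strings here.
Test = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                     'B': [1, 2, 'NaN', 4, 5],
                     'C': [1, 2, 3, 'NaT', 5]})

# errors='coerce' turns every value that cannot be parsed as a number into NaN,
# so dropna() can then see the missing cells.
cleaned = Test.apply(pd.to_numeric, errors='coerce').dropna()
print(cleaned)
```

Be careful with this on columns that legitimately contain text, since those values would be turned into NaN and their rows dropped as well.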


answered Sep 30 '22 21:09

Merlin

Your example DataFrame has NaN and NaT as strings, which .dropna(), .notnull() and co. don't treat as missing values, so for your example (with df being your DataFrame) you can use:

df[~df.isin(['NaN', 'NaT']).any(axis=1)]

This keeps only rows 0, 1 and 4, which is the output you asked for.

If you had a DataFrame such as this (note the use of np.nan and np.datetime64('NaT') instead of strings):

df = pd.DataFrame({'A':[1,2,3,4,5],'B':[1,2,np.nan,4,5],'C':[1,2,3,np.datetime64('NaT'),5]})

Then running df.dropna() would give you:

   A    B  C
0  1  1.0  1
1  2  2.0  2
4  5  5.0  5

Note that column B is now a float instead of an integer as that's required to store NaN values.
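
To check whether this is what is going on in your own data, you can look at the dtypes and at what isnull() actually detects. A rough sketch using the frame from the question (`missing_per_column` is just a name I picked for illustration):

```python
import pandas as pd

# Frame from the question, with the placeholders written as strings.
Test = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                     'B': [1, 2, 'NaN', 4, 5],
                     'C': [1, 2, 3, 'NaT', 5]})

# Columns that mix integers and strings get the generic object dtype.
print(Test.dtypes)

# Count the cells pandas considers missing in each column.
# The strings 'NaN' and 'NaT' are ordinary values, so nothing is counted.
missing_per_column = Test.isnull().sum()
print(missing_per_column)

# Inspect the Python type of one of the suspicious cells.
print(type(Test.loc[2, 'B']))
```

If the counts are all zero and the suspicious cell turns out to be a str, the "missing" values are really text and need to be replaced or coerced before dropna() can do anything.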

answered Sep 30 '22 23:09

Jon Clements

