Python Pandas Dataframe, remove all rows where 'None' is the value in any column

I have a large dataframe. When it was created, 'None' was used as the value wherever a number could not be calculated (instead of 'nan').

How can I delete all rows that have 'None' in any of their columns? I thought I could use df.dropna and set the value of na, but I can't seem to get that to work.

Thanks

I think this is a good representation of the dataframe:

temp = pd.DataFrame(data=[['str1','str2',2,3,5,6,76,8],['str3','str4',2,3,'None',6,76,8]])
jlt199 asked Aug 04 '17 17:08


People also ask

How do you drop rows of Pandas DataFrame whose values in a certain column is NaN?

When it comes to dropping null values in pandas DataFrames, the pandas.DataFrame.dropna() method is your friend. When you call dropna() on the whole DataFrame without specifying any arguments (i.e. using the default behaviour), it returns a new DataFrame without any row that contains at least one missing value. To consider only certain columns, pass their labels via the subset parameter.

How do I get rid of none rows in Pandas?

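pandas.DataFrame.dropna() removes rows that contain NaN / None values by default (axis=0); with axis=1 it removes columns instead. Note that it only treats real missing values as missing, not the string 'None'.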
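
As a minimal sketch of the row versus column behaviour of dropna(), assuming a small made-up frame (the column names "a" and "b" are illustrative only):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a single missing value in column "a"
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})

# Default (axis=0): drop every row that contains a missing value
rows_dropped = df.dropna()

# axis=1: drop every column that contains a missing value
cols_dropped = df.dropna(axis=1)

print(rows_dropped)
print(cols_dropped)
```

Neither call modifies df; each returns a new DataFrame.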

How do I delete rows in Pandas DataFrame based on condition?

Use boolean indexing to keep only the rows that do not match the condition, or pass the index labels of the matching rows to the pandas.DataFrame.drop() method.
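
A short sketch of condition-based row removal, using a made-up frame and an arbitrary condition (a > 2) chosen purely for illustration:

```python
import pandas as pd

# Hypothetical data; the column names are not from the question
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})

# Rows to remove: those where column "a" is greater than 2
condition = df["a"] > 2

# Option 1: boolean indexing, keeping rows where the condition is False
kept = df[~condition]

# Option 2: drop() with the index labels of the matching rows
dropped = df.drop(df[condition].index)

print(kept)
```

Both options return a new DataFrame and leave df unchanged.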


2 Answers

Setup
Borrowed @MaxU's df

df = pd.DataFrame([
    [1, 2, 3],
    [4, None, 6],
    [None, 7, 8],
    [9, 10, 11]
], dtype=object)

Solution
If the values are real None objects, you can just use pd.DataFrame.dropna as is, because pandas treats None as a missing value

df.dropna()

   0   1   2
0  1   2   3
3  9  10  11

Suppose instead that you have the string 'None', as in this df

df = pd.DataFrame([
    [1, 2, 3],
    [4, 'None', 6],
    ['None', 7, 8],
    [9, 10, 11]
], dtype=object)

Then combine dropna with mask

df.mask(df.eq('None')).dropna()

   0   1   2
0  1   2   3
3  9  10  11

To make sure the string comparison works for every column regardless of dtype, you can cast the entire dataframe to object before comparing.

df.mask(df.astype(object).eq('None')).dropna()

   0   1   2
0  1   2   3
3  9  10  11
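
As a sketch only, here is the same idea applied to the mixed-dtype frame from the question, next to a boolean-indexing variant that is not part of the answer above. The variable names is_none_str, via_mask and via_filter are made up for this example.

```python
import pandas as pd

# The sample frame from the question: strings and integers mixed
temp = pd.DataFrame(data=[
    ['str1', 'str2', 2, 3, 5, 6, 76, 8],
    ['str3', 'str4', 2, 3, 'None', 6, 76, 8],
])

# Cast to object first so every column is compared against the string
is_none_str = temp.astype(object).eq('None')

# Route 1 (as in the answer): turn matches into NaN, then drop those rows
via_mask = temp.mask(is_none_str).dropna()

# Route 2 (alternative): keep rows in which no cell equals 'None'
via_filter = temp[~is_none_str.any(axis=1)]

print(via_mask)
print(via_filter)
```

The two routes differ in one respect: route 1 also drops rows that already contain real NaN values, while route 2 removes only rows containing the string 'None'.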
piRSquared answered Oct 16 '22 06:10


Thanks for all your help. In the end I was able to get

df = df.replace(to_replace='None', value=np.nan).dropna()

to work. I'm not sure why your suggestions didn't work for me.
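
For reference, a self-contained sketch of that line run against the sample frame from the question; the name cleaned is made up here, and the snippet assumes numpy is imported as np, which the line above relies on.

```python
import numpy as np
import pandas as pd

# The sample frame from the question
temp = pd.DataFrame(data=[
    ['str1', 'str2', 2, 3, 5, 6, 76, 8],
    ['str3', 'str4', 2, 3, 'None', 6, 76, 8],
])

# Replace the string 'None' with a real missing value, then drop those rows
cleaned = temp.replace(to_replace='None', value=np.nan).dropna()

print(cleaned)
```

Only the first row survives, since the second row held 'None' in one of its columns.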

jlt199 answered Oct 16 '22 07:10