How to replace NA values with the mode of a DataFrame column in Python?

I'm completely new to Python (and this website) and am currently trying to replace NA values in specific dataframe columns with their mode. I've tried various methods which are not working. Please help me spot what I'm doing incorrectly:

Note: All the columns I'm working with are of type float64. All of my code runs, but when I check the number of nulls in the columns with df[cols_mode].isnull().sum(), it remains the same.

Method 1:

cols_mode = ['race', 'goal', 'date', 'go_out', 'career_c']

df[cols_mode].apply(lambda x: x.fillna(x.mode, inplace=True))

I tried the Imputer method too but got the same result.

Method 2:

for column in df[['race', 'goal', 'date', 'go_out', 'career_c']]:
    mode = df[column].mode()
    df[column] = df[column].fillna(mode)

Method 3:

df['race'].fillna(df.race.mode(), inplace=True)
df['goal'].fillna(df.goal.mode(), inplace=True)
df['date'].fillna(df.date.mode(), inplace=True)
df['go_out'].fillna(df.go_out.mode(), inplace=True)
df['career_c'].fillna(df.career_c.mode(), inplace=True)

Method 4: My methods became more and more manual, and finally this one works:

df['race'].fillna(2.0, inplace=True)
df['goal'].fillna(1.0, inplace=True)
df['date'].fillna(6.0, inplace=True)
df['go_out'].fillna(2.0, inplace=True)
df['career_c'].fillna(2.0, inplace=True) 
asked Nov 15 '16 20:11 by abcdapples



1 Answer

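mode() returns a Series, not a single value, because a column can have more than one mode. fillna aligns a Series argument on index labels, so passing the result directly only touches rows whose label appears in that Series (usually just label 0) and leaves the other NaN values alone. You still need to pick the element you want, typically the first, before replacing NaN values in your DataFrame. Also note that Method 1 passes x.mode without calling it, so fillna receives the method object rather than a value.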

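for column in ['race', 'goal', 'date', 'go_out', 'career_c']:
    # mode() returns a Series; [0] picks its first value.
    # Assign back instead of using inplace=True on a column selection,
    # which newer pandas versions may not propagate to df.
    df[column] = df[column].fillna(df[column].mode()[0])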
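
To see why passing the Series itself leaves the null count unchanged, here is a minimal sketch on made-up data. The column name 'race' is borrowed from the question, and the values are purely illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column; the values are invented for illustration.
df = pd.DataFrame({"race": [2.0, np.nan, 2.0, np.nan, 4.0]})

# mode() returns a Series, here a single value 2.0 stored under index label 0.
mode_series = df["race"].mode()

# Passing the Series: fillna aligns on index labels, so only a NaN at
# label 0 could be filled. Row 0 is not NaN, so nothing changes.
aligned = df["race"].fillna(mode_series)

# Passing the scalar: every NaN is replaced with the mode.
filled = df["race"].fillna(mode_series.iloc[0])

print(aligned.isnull().sum(), filled.isnull().sum())
```

This matches the symptom in the question: the code runs without error, but the NaN count only drops once a scalar is passed.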

If you want to apply it to all the columns of the DataFrame, then:

for column in df.columns:
    # Same idea for every column; assign back rather than relying on inplace=True.
    df[column] = df[column].fillna(df[column].mode()[0])
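
As an alternative to the loop, the same fill can probably be done in one step: DataFrame.mode() returns one row per mode, and .iloc[0] turns the first row into a Series keyed by column name, which DataFrame.fillna then applies column by column. The sketch below uses made-up data, and cols_mode is shortened from the question's list for brevity.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names follow the question, values are invented.
df = pd.DataFrame({
    "race": [2.0, np.nan, 2.0, 4.0],
    "goal": [1.0, 1.0, np.nan, 3.0],
    "date": [6.0, 5.0, 6.0, np.nan],
})

cols_mode = ["race", "goal"]  # only these columns get filled

# First row of DataFrame.mode(): one mode per column, indexed by column name.
modes = df[cols_mode].mode().iloc[0]

# fillna with a Series keyed by column name fills each column with its own value.
df[cols_mode] = df[cols_mode].fillna(modes)

print(df)
```

One caveat that applies to the loop as well: if a column contains only NaN, mode() comes back empty and taking its first element raises an error, so such columns need to be handled separately.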
answered Oct 04 '22 06:10 by Diego Mora Cespedes
