I'm completely new to Python (and this website) and am currently trying to replace NA values in specific dataframe columns with their mode. I've tried various methods which are not working. Please help me spot what I'm doing incorrectly:
Note: All the columns I'm working with are float64 types. All my code runs, but when I check the null count in the columns with df[cols_mode].isnull().sum(), it remains the same.
Method 1:
cols_mode = ['race', 'goal', 'date', 'go_out', 'career_c']
df[cols_mode].apply(lambda x: x.fillna(x.mode, inplace=True))
I tried the Imputer method too but got the same result.
Method 2:
for column in df[['race', 'goal', 'date', 'go_out', 'career_c']]:
    mode = df[column].mode()
    df[column] = df[column].fillna(mode)
Method 3:
df['race'].fillna(df.race.mode(), inplace=True)
df['goal'].fillna(df.goal.mode(), inplace=True)
df['date'].fillna(df.date.mode(), inplace=True)
df['go_out'].fillna(df.go_out.mode(), inplace=True)
df['career_c'].fillna(df.career_c.mode(), inplace=True)
Method 4: My methods became more and more manual, and finally this one works:
df['race'].fillna(2.0, inplace=True)
df['goal'].fillna(1.0, inplace=True)
df['date'].fillna(6.0, inplace=True)
df['go_out'].fillna(2.0, inplace=True)
df['career_c'].fillna(2.0, inplace=True)
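mode() returns a Series (a column can have more than one mode), so you still need to select the value you want, typically the first one with [0], before replacing NaN values in your DataFrame. If you pass the Series itself to fillna, values are matched by index label, so only a NaN in the row labeled 0 could ever be filled, which is why your null counts did not change.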
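for column in ['race', 'goal', 'date', 'go_out', 'career_c']:
    # Assign the result back; inplace=True on df[column] triggers a
    # chained-assignment warning in newer pandas versions.
    df[column] = df[column].fillna(df[column].mode()[0])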
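To see why the [0] matters, here is a minimal sketch on a made-up column (the values below are invented for illustration and are not from the asker's data). It compares filling with the Series returned by mode() against filling with its first element:

```python
import numpy as np
import pandas as pd

# Hypothetical toy column; the values are made up for illustration.
s = pd.Series([2.0, np.nan, 2.0, 1.0, np.nan], name="race")

# mode() returns a Series (there can be ties), not a scalar.
mode = s.mode()

# fillna with a Series aligns on index labels. The mode Series only has
# label 0, and row 0 is not missing, so no NaN gets filled.
filled_with_series = s.fillna(mode)

# Taking the first mode gives a scalar, which fills every NaN.
filled_with_scalar = s.fillna(mode[0])

print(filled_with_series.isnull().sum())  # NaN count is unchanged
print(filled_with_scalar.isnull().sum())  # no NaN left
```

This is the same behavior as Methods 2 and 3 in the question: the code runs without error, but the NaN rows never line up with the index of the mode Series.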
If you want to apply it to all the columns of the DataFrame, then:
for column in df.columns:
    # Assign the result back instead of relying on inplace=True.
    df[column] = df[column].fillna(df[column].mode()[0])
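As an alternative to the loops above, the fill can probably be done in one step, since DataFrame.fillna accepts a Series keyed by column name. The sketch below assumes a small made-up frame; the data and the name first_modes are hypothetical and only for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; the values are made up for illustration.
df = pd.DataFrame({
    "race": [2.0, np.nan, 2.0, 1.0],
    "goal": [1.0, 1.0, np.nan, 3.0],
})
cols_mode = ["race", "goal"]

# DataFrame.mode() returns one row per mode; .iloc[0] takes the first row,
# which is a Series indexed by column name.
first_modes = df[cols_mode].mode().iloc[0]

# fillna with a Series keyed by column name fills each column with its
# own value.
df[cols_mode] = df[cols_mode].fillna(first_modes)

print(df.isnull().sum())
```

Note that a column made up entirely of NaN has no mode, so the per-column mode()[0] lookup in the loops raises an error for it; it may be worth checking for that case first.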