After applying Imputer.fit_transform() to my dataset, I lose the column names on the transformed data frame. Is there any way to impute without losing the column names?
As I said in the comment to the question, fit_transform() returns a plain NumPy array, which carries no column labels. So just replace (re-assign) the values in the dataframe with the data returned from the Imputer.
Let's say this is your dataframe:
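import numpy as np
import pandas as pd
# Imputer was removed in scikit-learn 0.22; newer versions provide sklearn.impute.SimpleImputer
from sklearn.preprocessing import Imputer

df = pd.DataFrame(data=[[1, 2, 3],
                        [3, 4, 4],
                        [3, 5, np.nan],
                        [6, 7, 8],
                        [3, np.nan, 1]],
                  columns=['A', 'B', 'C'])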
Current df:
   A    B    C
0  1  2.0  3.0
1  3  4.0  4.0
2  3  5.0  NaN
3  6  7.0  8.0
4  3  NaN  1.0
If you are passing the whole df to Imputer, just use this:
df[df.columns] = Imputer().fit_transform(df)
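On scikit-learn 0.22 and later, Imputer is no longer available. A minimal sketch of the same re-assignment using sklearn.impute.SimpleImputer instead (a swapped-in class, not the one from the original answer) might look like this. The strategy="mean" argument is spelled out only for clarity, since mean imputation is the default:

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer  # stand-in for the removed Imputer

df = pd.DataFrame(data=[[1, 2, 3],
                        [3, 4, 4],
                        [3, 5, np.nan],
                        [6, 7, 8],
                        [3, np.nan, 1]],
                  columns=['A', 'B', 'C'])

# Replace each NaN with the mean of its column
imputer = SimpleImputer(strategy="mean")

# fit_transform returns an unlabeled NumPy array; assigning it back
# into the existing columns keeps the column names
df[df.columns] = imputer.fit_transform(df)

print(df)
```

The column names survive because the assignment targets the labels that already exist on df rather than building a new object from the array.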
If you are passing only some columns, assign the results back to those same columns:
columns_to_impute = ['B', 'C']
df[columns_to_impute] = Imputer().fit_transform(df[columns_to_impute])
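Output (shown for the whole-df version; if only B and C are imputed, column A is left untouched and keeps its integer values):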
     A    B    C
0  1.0  2.0  3.0
1  3.0  4.0  4.0
2  3.0  5.0  4.0
3  6.0  7.0  8.0
4  3.0  4.5  1.0
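If you would rather not overwrite the original dataframe, one possible variation is to wrap the returned array in a new DataFrame and pass the old column names and index along. This is only a sketch under the assumption that a recent scikit-learn with SimpleImputer is installed; the name imputed is made up for illustration:

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

df = pd.DataFrame(data=[[1, 2, 3],
                        [3, 4, 4],
                        [3, 5, np.nan],
                        [6, 7, 8],
                        [3, np.nan, 1]],
                  columns=['A', 'B', 'C'])

# Build a new, labeled frame from the unlabeled array;
# df itself is left unchanged and still contains its NaNs
imputed = pd.DataFrame(SimpleImputer().fit_transform(df),
                       columns=df.columns,
                       index=df.index)

print(imputed)
```

Passing index=df.index matters when the dataframe does not use the default 0..n-1 index, for example after filtering rows; without it the new frame would get a fresh range index.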