I have a pandas DataFrame df and want to encode its continuous and categorical features with different encoders. make_column_transformer is convenient for this, but the code shown below fails with LabelEncoder(), although it works fine with OneHotEncoder(handle_unknown='ignore'). The error message is:
TypeError: fit_transform() takes 2 positional arguments but 3 were given
It's not clear to me how to fix this issue.
The code:
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import RobustScaler, OneHotEncoder, LabelEncoder
continuous_features = ['COL1','COL2']       
categorical_features = ['COL3','COL4']
column_trans = make_column_transformer(
    (categorical_features,LabelEncoder()),
    (continuous_features, RobustScaler()))
X_enc = column_trans.fit_transform(df)
According to https://scikit-learn.org/stable/modules/generated/sklearn.compose.make_column_transformer.html, each tuple is passed as (transformer, columns), not (columns, transformer):
make_column_transformer(
...     (StandardScaler(), ['numerical_column']),
...     (OneHotEncoder(), ['categorical_column']))
So for your case:
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import RobustScaler, OneHotEncoder
continuous_features = ['COL1','COL2']       
categorical_features = ['COL3','COL4']
column_trans = make_column_transformer(
    (OneHotEncoder(), categorical_features),
    (RobustScaler(), continuous_features))
X_enc = column_trans.fit_transform(df)
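LabelEncoder() is the cause of the TypeError. It is designed for the target y, so its fit_transform() accepts only a single 1-D array, while the column transformer calls fit_transform(X, y) on every transformer. That is why LabelEncoder() fails inside make_column_transformer. Used on its own, it can only encode one column at a time, not two. To integer-encode several feature columns at once, use OrdinalEncoder() instead.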
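As a rough illustration of the integer-encoding route, here is a minimal sketch that swaps in OrdinalEncoder for the categorical columns. The sample DataFrame is made up for the example; only the column names and feature lists come from the question. The last lines show LabelEncoder working when it is given a single column directly, outside the column transformer.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder, RobustScaler

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "COL1": [1.0, 2.0, 3.0, 4.0],
    "COL2": [10.0, 20.0, 30.0, 40.0],
    "COL3": ["a", "b", "a", "c"],
    "COL4": ["x", "y", "y", "x"],
})

continuous_features = ["COL1", "COL2"]
categorical_features = ["COL3", "COL4"]

# OrdinalEncoder accepts a 2-D block of feature columns and implements
# fit_transform(X, y=None), so it is compatible with the column transformer.
column_trans = make_column_transformer(
    (OrdinalEncoder(), categorical_features),
    (RobustScaler(), continuous_features),
)
X_enc = column_trans.fit_transform(df)

# LabelEncoder still works on one 1-D column when called on its own.
col3_codes = LabelEncoder().fit_transform(df["COL3"])
```

The output columns follow the order of the tuples, so the two encoded categorical columns come first and the two scaled continuous columns follow. Note that OrdinalEncoder imposes an arbitrary order on the categories, which may or may not suit the downstream model; OneHotEncoder avoids that at the cost of more columns.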
Hope this helps.