I am trying to one-hot encode the categorical variables of my pandas DataFrame, which includes both categorical and continuous variables. I realise this can be done easily with the pandas get_dummies() function, but I need to use a pipeline so I can generate a PMML file later on.
This is the code to create a mapper. The categorical variables I would like to encode are stored in a list called 'dummies'.
from sklearn_pandas import DataFrameMapper
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import LabelEncoder
mapper = DataFrameMapper(
    [(d, LabelEncoder()) for d in dummies] +
    [(d, OneHotEncoder()) for d in dummies]
)
And this is the code to create a pipeline, including the mapper and linear regression.
from sklearn2pmml import PMMLPipeline
from sklearn.linear_model import LinearRegression
lm = PMMLPipeline([("mapper", mapper),
                   ("regressor", LinearRegression())])
When I now try to fit (with 'features' being a dataframe, and 'targets' a series), it gives an error 'could not convert string to float'.
lm.fit(features, targets)
Anyone who can help me out? I am desperate for working pipelines including the preprocessing of data... Thanks in advance!
OneHotEncoder. Encode categorical integer features using a one-hot aka one-of-K scheme. The input to this transformer should be a matrix of integers, denoting the values taken on by categorical (discrete) features. The output will be a sparse matrix where each column corresponds to one possible value of one feature.
OneHotEncoder doesn't support string features (this applies to scikit-learn versions before 0.20), and with [(d, OneHotEncoder()) for d in dummies] you are applying it to all dummies columns. Use LabelBinarizer instead:
from sklearn.preprocessing import LabelBinarizer
mapper = DataFrameMapper(
    [(d, LabelBinarizer()) for d in dummies]
)
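To see what LabelBinarizer does to a single string column, here is a minimal sketch outside of any mapper or pipeline. The `colors` array is made-up sample data, not taken from the question.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

# Hypothetical string-valued categorical column (illustrative data only).
colors = np.array(["red", "green", "blue", "green"])

binarizer = LabelBinarizer()
# fit_transform accepts strings directly and returns a dense 0/1 array
# with one column per class, ordered like binarizer.classes_.
encoded = binarizer.fit_transform(colors)

print(binarizer.classes_)
print(encoded)
```

Note that for a column with exactly two distinct values, LabelBinarizer returns a single 0/1 column rather than two.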
An alternative would be to use the LabelEncoder with a second OneHotEncoder step.
mapper = DataFrameMapper(
    [(d, LabelEncoder()) for d in dummies]
)
lm = PMMLPipeline([("mapper", mapper),
                   ("onehot", OneHotEncoder()),
                   ("regressor", LinearRegression())])
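The two-step idea can be checked on a single column without the mapper or PMMLPipeline. This is only a rough sketch with made-up sample data (`sizes` is hypothetical): LabelEncoder first turns the strings into integer codes, and OneHotEncoder then expands those integers into indicator columns.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical string-valued categorical column (illustrative data only).
sizes = np.array(["small", "large", "medium", "large"])

# Step 1: strings -> integer codes (classes are sorted alphabetically).
codes = LabelEncoder().fit_transform(sizes)

# Step 2: integer codes -> one-hot columns. OneHotEncoder expects a 2-D
# input and returns a sparse matrix by default, so reshape the codes and
# convert the result to a dense array for inspection.
onehot = OneHotEncoder().fit_transform(codes.reshape(-1, 1)).toarray()

print(codes)
print(onehot)
```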