Attempting to work with something that looks a little like this:
CATEGORY | NUMBER VALUE | ID
FRUIT | 15 | XCD
VEGGIES | 12 | ZYK
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
data = data.iloc[:,:].values
enc = LabelEncoder()
data[:,0] = enc.fit_transform(data[:,0])
data
array([[1, 15, 'XCD'],
[2, 12, 'ZYK']])
Then...
from sklearn.compose import ColumnTransformer
ct = ColumnTransformer(transformers=[('encode',OneHotEncoder,[0])],remainder='passthrough')
dataset = np.array(ct.fit_transform(data))
gives
TypeError: Cannot clone object. You should provide an instance of scikit-learn estimator instead of a class.
I believe I resolved this one. The TypeError is pretty self-explanatory, and I'm ashamed for not recognizing this before posting my question: I was passing the OneHotEncoder class itself to ColumnTransformer, which needs an instance of it. Adding one line that creates the instance (oHe = OneHotEncoder() in the code below) resolved my situation. Thank you!
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
# Create an instance of the encoder; passing the bare class is what raised the TypeError
oHe = OneHotEncoder()
ct = ColumnTransformer(transformers=[('encode', oHe, [0])], remainder='passthrough')
dataset = np.array(ct.fit_transform(data))
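For anyone who wants to reproduce both cases side by side, here is a minimal sketch. The two-row array is made up to mirror the table in the question, and `encode` is a hypothetical helper written only for this illustration, not part of scikit-learn.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Made-up data mirroring the table in the question (assumption, not the real dataset).
data = np.array([["FRUIT", 15, "XCD"],
                 ["VEGGIES", 12, "ZYK"]], dtype=object)


def encode(encoder):
    """Hypothetical helper: one-hot encode column 0, pass the other columns through."""
    ct = ColumnTransformer(
        transformers=[("encode", encoder, [0])],
        remainder="passthrough",
    )
    return np.array(ct.fit_transform(data))


# Passing the class itself: scikit-learn cannot clone a class, so fitting fails.
try:
    encode(OneHotEncoder)
    class_error = None
except TypeError as exc:
    class_error = exc

# Passing an instance (note the parentheses) works.
dataset = encode(OneHotEncoder())
print(class_error)
print(dataset)
```

The only difference between the failing and the working call is the pair of parentheses after `OneHotEncoder`.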
I faced a similar issue when fitting RandomizedSearchCV with xgboost. As said above, I also felt ashamed for not spotting such a simple error. I typed regressor = xgboost.XGBRegressor instead of regressor = xgboost.XGBRegressor(). After reading this thread, I spent some time tracking the error down to that line, and with the parentheses added it worked fine.
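If this mistake keeps catching you out, a small guard can flag it before the search starts. This is only a sketch: `require_instance` is a hypothetical helper, and `Ridge` stands in for `xgboost.XGBRegressor` so the snippet runs with scikit-learn alone; the data is synthetic.

```python
import inspect

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import RandomizedSearchCV


def require_instance(estimator):
    """Hypothetical guard: reject a class passed where an instance is expected."""
    if inspect.isclass(estimator):
        name = estimator.__name__
        raise TypeError(f"{name} is a class; call {name}() to create an instance.")
    return estimator


# Synthetic regression data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=40)

# Forgetting the parentheses is caught with a readable message.
try:
    require_instance(Ridge)
    guard_message = None
except TypeError as exc:
    guard_message = str(exc)

# With an instance, the randomized search fits normally.
search = RandomizedSearchCV(
    estimator=require_instance(Ridge()),
    param_distributions={"alpha": [0.1, 1.0, 10.0]},
    n_iter=3,
    cv=2,
    random_state=0,
)
search.fit(X, y)
print(guard_message)
print(search.best_params_)
```

The same check should apply to any estimator that follows the scikit-learn interface, since the failure comes from handing over a class rather than from the specific model.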