I'm building an XGBoost model to predict coronavirus infections from province and region codes. Dataset: https://www.kaggle.com/sudalairajkumar/covid19-in-italy.
I have split the data, but when I try to set up the model, I get the following error:
XGBoostError: [16:16:15] C:/Users/Administrator/workspace/xgboost-win64_release_1.0.0/src/objective/multiclass_obj.cu:115: SoftmaxMultiClassObj: label must be in [0, num_class).
Code is as follows:
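import xgboost as xgb
from sklearn.model_selection import train_test_split

# df is the DataFrame loaded from the Kaggle dataset
train = df[['RegionCode', 'ProvinceCode']].astype(int)  # features
test = df['TotalPositiveCases'].astype(int)  # target
# train_test_split returns X_train, X_test, y_train, y_test, in that order
X_train, X_test, y_train, y_test = train_test_split(
    train, test, test_size=0.30, random_state=42)
train = xgb.DMatrix(X_train, label=y_train)
test = xgb.DMatrix(X_test, label=y_test)
param = {
    'max_depth': 4,
    'eta': 0.3,
    'objective': 'multi:softmax',
    'num_class': 3}
epochs = 10  # number of boosting rounds
model = xgb.train(param, train, epochs)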
The error is raised by the xgb.train call on the last line.
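This error is raised when the labels contain values outside the range [0, num_class). With num_class set to 3, multi:softmax accepts only the labels 0, 1 and 2, but TotalPositiveCases is a case count that takes many other values.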
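A quick way to see which labels violate that range is to list the distinct values that fall outside it. The snippet below is only a sketch: the helper name labels_out_of_range and the sample labels are made up for illustration and are not part of XGBoost or the dataset.

```python
import numpy as np


def labels_out_of_range(labels, num_class):
    """Return the sorted distinct label values that fall outside [0, num_class)."""
    values = np.unique(np.asarray(labels))
    return values[(values < 0) | (values >= num_class)]


# Hypothetical labels that look like case counts (not taken from the dataset).
y = [0, 1, 2, 5, 130, 2]
bad = labels_out_of_range(y, num_class=3)
print(bad.tolist())  # [5, 130]
```

An empty result means every label is valid for the chosen num_class. Anything else is a value that would make multi:softmax raise the error above.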
Check whether your target has more distinct values than num_class allows. Printing target.unique() shows them, and it will also reveal any null or NaN entries that may be in the data.
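If the target really is meant to be treated as a set of classes, one possible fix is to remap its values to contiguous codes 0..K-1 and set num_class to K. The following is a minimal sketch under that assumption. The helper encode_labels and the sample values are hypothetical, and only the parameter names objective and num_class come from the question.

```python
import numpy as np


def encode_labels(labels):
    """Check for missing labels, then remap label values to codes 0..K-1."""
    arr = np.asarray(labels, dtype=float)
    n_missing = int(np.isnan(arr).sum())
    if n_missing:
        raise ValueError(f"{n_missing} missing label(s); drop or impute them first")
    # classes holds the sorted distinct values; codes indexes into classes.
    classes, codes = np.unique(arr, return_inverse=True)
    return codes, classes


# Hypothetical target values (not taken from the dataset).
y = [10, 250, 10, 37, 250]
codes, classes = encode_labels(y)

# num_class is derived from the data instead of being hard-coded.
param = {"objective": "multi:softmax", "num_class": len(classes)}
print(codes.tolist(), param["num_class"])  # [0, 2, 0, 1, 2] 3
```

The codes would then be passed as the label when building the DMatrix. Note that a count such as TotalPositiveCases can produce a very large number of classes this way, so a regression objective (for example reg:squarederror) may be a better fit for this target than multi:softmax.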