
XGBoost Error in R Studio ("'data' has class 'character' and length...")

I am having difficulties fitting my data to an xgboost classifier model. When I run this:

classifier = xgboost(data = as.matrix(training_set[c(4:15, 17:18,20:28)]), 
  label = training_set$posted_ind, nrounds = 10)

R Studio tells me:

Error in xgb.DMatrix(data, label = label, missing = missing) : 
'data' has class 'character' and length 1472000.
'data' accepts either a numeric matrix or a single filename. 

The training set contains both continuous and categorical variables, but all categorical variables have been encoded as such (and the same data were fit to random forest and naive Bayes models). Is there some additional step I need to complete so that I can use these data in an xgboost model?

user10360304 asked Mar 06 '23 10:03


1 Answer

Make sure that the columns of "training_set" you pass to xgboost are not factors. If you encoded your categorical variables as numbers but then cast them to factors, as.matrix() returns a character matrix instead of a numeric one, and xgb.DMatrix raises this error.
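To illustrate, here is a minimal sketch of how you might spot and convert such columns. The toy data frame and its column names ("age", "region") are made up for the example and only stand in for your real training_set; the approach assumes the factor levels are numeric codes, as described in the question.

```r
# Toy stand-in for training_set (hypothetical data, not the asker's).
training_set <- data.frame(
  age        = c(23, 35, 41, 52),
  region     = factor(c(1, 2, 2, 3)),  # numeric codes stored as a factor
  posted_ind = c(0, 1, 0, 1)
)

features <- training_set[c("age", "region")]

# With a factor column present, as.matrix() falls back to a character matrix,
# which is what xgb.DMatrix rejects.
is.character(as.matrix(features))  # TRUE

# Find the offending columns.
factor_cols <- names(features)[sapply(features, is.factor)]

# Convert factor -> numeric via as.character() so the original codes are
# recovered rather than the internal level indices.
features[factor_cols] <- lapply(
  features[factor_cols],
  function(x) as.numeric(as.character(x))
)

x <- as.matrix(features)
is.numeric(x)  # TRUE
```

The resulting x can then be passed as the data argument in the xgboost() call from the question, with label = training_set$posted_ind as before.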

user11442360 answered Mar 15 '23 11:03