I'm trying to run a model with the mlr package but I'm having some problems with the predict() function. It gives me the following error message:
Error in predict(mod, task = task, subset = test) :
Assertion on 'subset' failed: Must be of type 'integerish', not 'data.frame'
Please find a reproducible example below:
require(mlr) # models
require(caTools) # sampling
require(Zelig) # data
data("voteincome")
voteincome$vote <- as.factor(voteincome$vote)
set.seed(0)
sample <- sample.split(voteincome, SplitRatio = .75)
train <- subset(voteincome, sample == TRUE)
test <- subset(voteincome, sample == FALSE)
train <- na.omit(train)
test <- na.omit(test)
task <- makeClassifTask(data = train, target = "vote")
lrnr <- makeLearner("classif.randomForest")
mod <- train(lrnr, task)
pred <- predict(mod, task = task, subset = test)
And then the error appears. Am I doing something wrong? Thanks!
mlr expects an index vector (integer row numbers of the task's data) in the subset argument, not a data frame, which is why the assertion fails. It then subsets the task data for you, so you don't have to split the data frame yourself. You can also let mlr do the partitioning into train and test sets automatically with a resample description (see the tutorial):
require(mlr) # models
require(caTools) # sampling
require(Zelig) # data
data("voteincome")
voteincome$vote <- as.factor(voteincome$vote)
set.seed(0)
task <- makeClassifTask(data = voteincome, target = "vote")
lrnr <- makeLearner("classif.randomForest")
rdesc <- makeResampleDesc("Holdout", split = 0.75)
res <- resample(learner = lrnr, task = task, resampling = rdesc)
# get predictions on test set
getPredictionResponse(res$pred)
# compute accuracy, also see https://mlr-org.github.io/mlr-tutorial/devel/html/performance/index.html
performance(res$pred, acc)
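If you would rather keep the explicit train()/predict() calls, here is a minimal sketch of the index-vector usage described above. To keep it self-contained I'm assuming the built-in iris data and the "classif.rpart" learner as stand-ins for voteincome and "classif.randomForest"; the names train.idx and test.idx are just illustrative.

```r
library(mlr)

set.seed(0)

# Build the task on the FULL data set; the split is expressed as row indices.
# (iris and classif.rpart are stand-ins, not the data/learner from the question.)
task <- makeClassifTask(data = iris, target = "Species")
lrnr <- makeLearner("classif.rpart")

# Integer row indices for a 75/25 split (hypothetical variable names)
n <- getTaskSize(task)
train.idx <- sample(n, size = round(0.75 * n))
test.idx <- setdiff(seq_len(n), train.idx)

# subset takes the index vectors, not data frames
mod <- train(lrnr, task, subset = train.idx)
pred <- predict(mod, task = task, subset = test.idx)
performance(pred, measures = acc)

# Alternative: if you already hold the test rows as a data frame,
# pass them via newdata instead of task/subset
pred2 <- predict(mod, newdata = iris[test.idx, ])
```

Either way, the model never sees the held-out rows during training, and predict() gets what its arguments expect: indices for subset, a data frame for newdata.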
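Try this, which calls predict() on the underlying randomForest model and passes the test data frame as newdata (that method has no task or subset arguments):
pred <- predict(mod$learner.model, newdata = test)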