
Error in R gbm function when cv.folds > 0

Tags: r, gbm

I am using gbm to predict a binary response. When I set cv.folds=0, everything works well. However, when cv.folds > 1, I get the error "Error in object$var.levels[[i]] : subscript out of bounds" when the first iteration of cross-validation finishes. Someone said this could be because some factor variables have missing levels in the training or testing data, but I tried using only numeric variables and still get this error.

> gbm.fit <- gbm(model.formula,
+                data=dataall_train,
+                distribution = "adaboost",
+                n.trees=10,
+                shrinkage=0.05,
+                interaction.depth=2,
+                bag.fraction = 0.5,
+                n.minobsinnode = 10,      
+                train.fraction=0.5,
+                cv.folds=3,
+                verbose=T,
+                n.cores=1)
CV: 1 
CV: 2 
CV: 3 
Error in object$var.levels[[i]] : subscript out of bounds

Does anyone have any insight into this? Thanks!

Answering my own question: problem solved. This is caused by a bug in the function: the input data cannot contain any variables other than the ones used in the model.

Yoki asked Aug 26 '14 20:08


1 Answer

I second this solution: the data passed to the R function gbm() must not include variables (columns) that are not used in your model.
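One way to apply this workaround is to drop the unused columns just before fitting. The sketch below is only an illustration: keep_model_columns() and fit_with_cv() are hypothetical helpers (not part of gbm), the data frame is made up to stand in for dataall_train, and it assumes the formula names its variables explicitly as bare column names (no `.` shorthand).

```r
# Hypothetical helper (not part of gbm): return `data` restricted to the
# columns that `formula` actually references (response + predictors).
keep_model_columns <- function(formula, data) {
  vars <- all.vars(formula)
  not_found <- setdiff(vars, names(data))
  if (length(not_found) > 0) {
    stop("Columns not found in data: ", paste(not_found, collapse = ", "))
  }
  data[, vars, drop = FALSE]
}

# Made-up data standing in for dataall_train; `extra_id` is a column
# that the model formula does not use.
set.seed(1)
n <- 400
dataall_train <- data.frame(
  y        = rbinom(n, 1, 0.5),
  x1       = rnorm(n),
  x2       = rnorm(n),
  extra_id = seq_len(n)
)
model.formula <- y ~ x1 + x2

# Trimmed copy: only the columns named in the formula remain.
train_subset <- keep_model_columns(model.formula, dataall_train)
print(names(train_subset))

# Hypothetical wrapper: same settings as the call in the question,
# but gbm() only ever sees the trimmed data frame.
fit_with_cv <- function(formula, data, cv.folds = 3) {
  gbm::gbm(formula,
           data = keep_model_columns(formula, data),
           distribution = "adaboost",
           n.trees = 10,
           shrinkage = 0.05,
           interaction.depth = 2,
           bag.fraction = 0.5,
           n.minobsinnode = 10,
           train.fraction = 0.5,
           cv.folds = cv.folds,
           verbose = FALSE,
           n.cores = 1)
}
```

With the gbm package installed, fit_with_cv(model.formula, dataall_train) would then run the cross-validated fit on the trimmed data. If the formula uses transformations or the `.` shorthand, the column selection would need to be adapted.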

user3375662 answered Oct 28 '22 20:10