
How to use custom cross validation folds with XGBoost

Tags: r, xgboost

I'm using the R wrapper for XGBoost. The function xgb.cv has a folds parameter with the description:

list provides a possibility of using a list of pre-defined CV folds (each element must be a vector of fold's indices). If folds are supplied, the nfold and stratified parameters would be ignored.

So, do I just specify the indices for training the model and assume the rest will be for testing? For example, if my training data is something like

    Feature1 Feature2 Target
 1:        2       10     10
 2:        7        1      9
 3:        8        2      3
 4:        8       10      7
 5:        8        2      9
 6:        3        7      3

and I want to cross-validate using the (train, test) index pairs ((1,2,3), (4,5,6)) and ((4,5,6), (1,2,3)), do I set folds=list(c(1,2,3), c(4,5,6))?

Ben asked Oct 27 '25 19:10



2 Answers

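Through some trial and error, I figured out that xgboost treats the passed indices as the indices of the test folds, not the training folds. So in the example from the question, folds=list(c(1,2,3), c(4,5,6)) holds out rows 1-3 in the first round and rows 4-6 in the second. The current devel version of the xgboost documentation states this explicitly.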
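To make the test-fold semantics concrete, here is a minimal base R sketch (no xgboost needed) that spells out the (train, test) pairs implied by a folds list. It assumes the training rows for a fold are the rows of all the other folds; the `splits` name is just for illustration.

    ```r
    # Folds as passed to xgb.cv: each element holds the TEST row indices of one fold.
    folds <- list(c(1, 2, 3), c(4, 5, 6))

    # Assumption: the training rows for fold k are the rows of all other folds.
    splits <- lapply(seq_along(folds), function(k) {
      list(train = unlist(folds[-k]), test = folds[[k]])
    })

    # Print the implied (train, test) pair for each fold.
    for (k in seq_along(splits)) {
      cat("fold", k,
          "train:", splits[[k]]$train,
          "test:", splits[[k]]$test, "\n")
    }
    ```

With the two folds from the question, the pairs come out symmetric, so the same list works whichever way you read it. With three or more folds the distinction matters.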

Ben answered Oct 29 '25 07:10



Here is an example for both generating the folds and using them.

Assume the data frame has a column of ids, and we want all rows that share an id value to end up in the same fold (one fold per id).

The code below

  • finds the unique ids
  • preallocates a list for the folds
  • iterates over the ids, storing for each one the vector of row indices that match

    fold.ids <- unique(df$id)
    custom.folds <- vector("list", length(fold.ids))
    i <- 1
    for (id in fold.ids) {
      custom.folds[[i]] <- which(df$id %in% id)
      i <- i + 1
    }
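As an aside, the same one-fold-per-id list can probably be built more compactly with base R's split(). This is a sketch on a hypothetical toy data frame (the `df` contents here are made up for illustration); note that split() orders the folds by sorted id rather than by first appearance.

    ```r
    # Hypothetical toy data: six rows, three groups identified by `id`.
    df <- data.frame(id = c("a", "b", "a", "c", "b", "c"),
                     x  = c(2, 7, 8, 8, 8, 3))

    # One fold per id: split() groups the row numbers by the id value.
    # unname() drops the id labels so the result is a plain list of index vectors.
    custom.folds <- unname(split(seq_len(nrow(df)), df$id))

    # Sanity check: every row appears in exactly one fold.
    stopifnot(identical(sort(unlist(custom.folds)), seq_len(nrow(df))))

    print(custom.folds)
    ```

Either form yields integer row indices, which is what the folds argument expects.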

Here is an example using the above fold list in xgb.cv

    res <- xgb.cv(params = param, data = dtrain, nrounds = nround, folds = custom.folds, prediction = TRUE)
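For completeness, here is a self-contained sketch of such a call on the six-row table from the question. The parameter values (objective, max_depth, nrounds) are illustrative assumptions rather than recommendations, and the whole block is wrapped in a requireNamespace() check so it only runs when xgboost is installed.

    ```r
    # Sketch only: runs when the xgboost package is installed.
    if (requireNamespace("xgboost", quietly = TRUE)) {
      # Toy data from the question (6 rows, 2 features, numeric target).
      # matrix() fills column by column: Feature1 first, then Feature2.
      X <- matrix(c(2, 7, 8, 8, 8, 3,
                    10, 1, 2, 10, 2, 7), ncol = 2,
                  dimnames = list(NULL, c("Feature1", "Feature2")))
      y <- c(10, 9, 3, 7, 9, 3)
      dtrain <- xgboost::xgb.DMatrix(data = X, label = y)

      # Each element lists the (integer) rows held out for testing in that round.
      custom.folds <- list(1:3, 4:6)

      # Illustrative parameter values, not recommendations.
      param <- list(objective = "reg:squarederror", max_depth = 2)

      res <- xgboost::xgb.cv(params = param, data = dtrain, nrounds = 5,
                             folds = custom.folds, prediction = TRUE,
                             verbose = FALSE)

      # Per-round train/test metrics averaged over the custom folds.
      print(res$evaluation_log)
    }
    ```

Because folds is supplied, nfold is not passed; the number of folds is taken from the length of the list.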

Reasonable values for the other xgb.cv parameters can be found in the documentation.

Andrew Olney answered Oct 29 '25 08:10
