I'm using the R wrapper for XGBoost. In the function xgb.cv, there is a folds parameter with this description:
list provides a possibility of using a list of pre-defined CV folds (each element must be a vector of fold's indices). If folds are supplied, the nfold and stratified parameters would be ignored.
So, do I just specify the indices for training the model and assume the rest will be for testing? For example, if my training data is something like
   Feature1 Feature2 Target
1:        2       10     10
2:        7        1      9
3:        8        2      3
4:        8       10      7
5:        8        2      9
6:        3        7      3
and I want to cross validate using (train, test) indices as ((1,2,3), (4,5,6)) and ((4,5,6), (1,2,3)) do I set folds=list(c(1,2,3), c(4,5,6))?
Through some trial and error I figured out that xgboost uses the passed indices as the indices of the test folds. I confirmed this by checking that the current development version of xgboost states it explicitly in the documentation.
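To make the semantics concrete (a base-R sketch using the toy data above, not part of the original answer): each element of folds is a vector of test-row indices, and the training rows for that fold are simply the complement.

```r
# Each element of folds holds the TEST indices for one CV round
folds <- list(c(1, 2, 3), c(4, 5, 6))
n <- 6  # number of rows in the toy data above

# xgboost trains on the remaining rows for each fold
train.idx <- lapply(folds, function(test) setdiff(seq_len(n), test))
train.idx[[1]]  # rows 4, 5, 6 are used for training when fold 1 is held out
```

So folds=list(c(1,2,3), c(4,5,6)) does give the (train, test) splits ((4,5,6), (1,2,3)) and ((1,2,3), (4,5,6)), i.e. the listed indices are the held-out test sets.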
Here is an example of both generating the folds and using them. Assume our data frame has a column of ids, and we want to put all rows with a given id value into the same fold. The code below iterates over the ids, building a list of the row indices that match each id:
fold.ids <- unique(df$id)
custom.folds <- vector("list", length(fold.ids))

i <- 1
for (id in fold.ids) {
  # All rows sharing this id go into the same (test) fold
  custom.folds[[i]] <- which(df$id == id)
  i <- i + 1
}
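The same fold list can also be built in one line with base R's split() (an equivalent alternative; the df below is made up for demonstration):

```r
# Hypothetical data frame: two rows per id, ids already in sorted order
df <- data.frame(id = c("a", "a", "b", "b", "c", "c"), x = 1:6)

# split() groups the row indices by id; unname() drops the id names
# so the result matches the plain list the loop above produces.
# Note: split() orders folds by sorted factor levels, which can differ
# from unique()'s order of first appearance if the ids are unsorted.
custom.folds <- unname(split(seq_len(nrow(df)), df$id))
custom.folds  # list(1:2, 3:4, 5:6)
```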
Here is an example of using the above fold list in xgb.cv (the arguments are named here for clarity; positionally they are params, data, and nrounds):

res <- xgb.cv(params = param, data = dtrain, nrounds = nround,
              folds = custom.folds, prediction = TRUE)
Reasonable values for the other xgb.cv parameters can be found in its documentation.
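For completeness, here is a hedged, self-contained sketch of the whole setup on the toy data from the question. The parameter values are purely illustrative (not from the original answer), and the call assumes the classic xgboost R API; the xgb.cv step only runs if the package is installed.

```r
# Illustrative parameters only -- tune these for your own data
param <- list(
  objective = "reg:squarederror",  # squared-error regression
  max_depth = 3,
  eta = 0.1
)
nround <- 20

if (requireNamespace("xgboost", quietly = TRUE)) {
  library(xgboost)

  # Toy data from the question: two features, one numeric target
  X <- matrix(c(2, 7, 8, 8, 8, 3,
                10, 1, 2, 10, 2, 7), ncol = 2,
              dimnames = list(NULL, c("Feature1", "Feature2")))
  y <- c(10, 9, 3, 7, 9, 3)
  dtrain <- xgb.DMatrix(data = X, label = y)

  # Each fold vector holds the TEST indices; nfold is ignored
  res <- xgb.cv(params = param, data = dtrain, nrounds = nround,
                folds = list(c(1, 2, 3), c(4, 5, 6)),
                prediction = TRUE, verbose = FALSE)
}
```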