
Tuning integer vector in mlr

Tags: r, mlr

I am creating custom learners; in particular, I am trying to use the h2o machine learning algorithms within the mlr framework. The 'hidden' parameter of the h2o.deeplearning function is an integer vector, which I want to tune. I defined the 'hidden' parameter as follows:

makeRLearner.classif.h2o_dl = function() {
  makeRLearnerClassif(
    cl = "classif.h2o_dl",
    package = "h2o",
    par.set = makeParamSet(
      makeDiscreteLearnerParam(id = "activation",
        values = c("Rectifier", "Tanh", "TanhWithDropout", "RectifierWithDropout", "Maxout", "MaxoutWithDropout")),
      makeNumericLearnerParam(id = "epochs", default = 10, lower = 1),
      makeNumericLearnerParam(id = "rate", default = 0.005, lower = 0, upper = 1),
      makeIntegerVectorLearnerParam(id = "hidden", default = c(100, 100)),
      makeDiscreteLearnerParam(id = "loss", values = c("Automatic",
        "CrossEntropy", "Quadratic", "Absolute", "Huber"))
    ),
    properties = c("twoclass", "multiclass", "numerics", "factors", "prob", "missings"),
    name = "Deep Learning Neural Network with h2o",
    short.name = "h2o_deeplearning_classif",
    note = "tbd"
  )
}

trainLearner.classif.h2o_dl = function(.learner, .task, .subset, .weights = NULL, ...) {
  # Upload the training subset to the H2O cluster under a timestamped frame name.
  data = getTaskData(.task, .subset)
  data_h2o <- as.h2o(data,
                   destination_frame = paste0(
                     "train_",
                     format(Sys.time(), "%m%d%y_%H%M%S")))
  # Hyperparameters such as 'hidden' are passed on through '...'.
  h2o::h2o.deeplearning(x = getTaskFeatureNames(.task),
             y = getTaskTargetNames(.task),
             training_frame = data_h2o, ...)
}

predictLearner.classif.h2o_dl = function(.learner, .model, .newdata, predict.method = "plug-in", ...) {
  data <- as.h2o(.newdata,
               destination_frame = paste0("pred_",
                                          format(Sys.time(), "%m%d%y_%H%M%S")))
  p = predict(.model$learner.model, newdata = data, method = predict.method, ...)
  # Column 1 of the prediction frame is the predicted class;
  # the remaining columns are the class probabilities.
  p.df = as.data.frame(p)
  if (.learner$predict.type == "response")
    return(p.df[, 1]) else return(as.matrix(p.df[, -1]))
}

I tried tuning the parameter 'hidden' via grid search by means of the makeDiscreteParam function:

library(mlr)
library(h2o)
h2o.init()

lrn.h2o <- makeLearner("classif.h2o_dl")
n <- getTaskSize(sonar.task)
train.set = seq(1, n, by = 2)
test.set = seq(2, n, by = 2)
mod.h2o = train(lrn.h2o, sonar.task, subset = train.set)
pred.h2o <- predict(mod.h2o, task = sonar.task, subset = test.set)

ctrl = makeTuneControlGrid()
rdesc = makeResampleDesc("CV", iters = 3L)
ps = makeParamSet(
makeDiscreteParam("hidden", values = list(c(10,10),c(100,100))),
makeDiscreteParam("rate", values = c(0.1,0.5))
)

res = tuneParams("classif.h2o_dl", task = sonar.task, resampling = rdesc,par.set = ps,control = ctrl)

which resulted in the warning messages

Warning messages:
1: In checkValuesForDiscreteParam(id, values) :
 number of items to replace is not a multiple of replacement length
2: In checkValuesForDiscreteParam(id, values) :
 number of items to replace is not a multiple of replacement length

and ps looks like this:

ps
           Type len Def  Constr Req Tunable Trafo
hidden discrete   -   -  10,100   -    TRUE     -
rate   discrete   -   - 0.1,0.5   -    TRUE     -

This does not tune the 'hidden' parameter as a vector: the two vectors are collapsed to the scalars 10 and 100. I also tried other special constructor functions (e.g. makeNumericVectorParam), which did not work either. Does anyone have experience tuning (integer) vectors in mlr and could give me a hint?

ptr_ asked Mar 01 '16 10:03



2 Answers

To tune the "hidden" parameter, give each vector in the list of values a name when defining the grid:

makeDiscreteParam(id = "hidden", values = list(a = c(10,10), b = c(100,100)))

Check this out:

https://github.com/mlr-org/mlr/issues/1305
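As a minimal sketch of how this fits into the parameter set from the question (assuming the ParamHelpers package, which mlr builds on, is installed; the labels a and b are arbitrary):

```r
library(ParamHelpers)

# Each candidate vector gets an explicit name, so no name has to be
# derived from the vector itself.
ps <- makeParamSet(
  makeDiscreteParam("hidden", values = list(a = c(10, 10), b = c(100, 100))),
  makeDiscreteParam("rate", values = c(0.1, 0.5))
)

# The stored values are still the full vectors, keyed by their labels.
hidden_values <- ps$pars$hidden$values
print(names(hidden_values))
print(hidden_values$b)
```

The resulting ps can then be passed to tuneParams via par.set, as in the question.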

perevales answered Oct 28 '22 13:10



The warning messages and the failure to construct the proper ParamSet occur because ParamHelpers tries to add names to the list of values, which fails when the values are vectors. perevales's answer supplies the names explicitly, which is why it works.
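The mechanism can be reproduced in base R. This is only a rough sketch, not the actual ParamHelpers source: it assumes the helper tries to store as.character(value) as a single name per list entry, which is consistent with the warning text and with the "10,100" constraint shown in the question.

```r
# Unnamed list of candidate vectors, as in the question.
values <- list(c(10, 10), c(100, 100))

# Hypothetical name-guessing step: one name slot per list entry.
ns <- rep("", length(values))
msgs <- character(0)

for (i in seq_along(values)) {
  withCallingHandlers(
    # as.character() yields two strings, but ns[i] has room for only one.
    ns[i] <- as.character(values[[i]]),
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
}

print(ns)    # only the first element of each vector survives as the name
print(msgs)  # one warning per list entry
```

Each assignment keeps only the first element of the vector, so the two entries end up labelled "10" and "100", and one warning is raised per entry.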

However, when you want to tune a vector of integer values, it is probably most advisable to use makeIntegerVectorParam:

ps <- makeParamSet(
  makeIntegerVectorParam("hidden", len = 2, lower = 10, upper = 100),
  makeDiscreteParam("rate", values = c(0.1, 0.5))
)

This will not only try c(10, 10) and c(100, 100), but also values like c(10, 100).

In fact, this also considers all values between 10 and 100 (e.g. c(30, 80)), so it may be desirable to reduce the search space a little, using transformations. Example:

ps <- makeParamSet(
  makeIntegerVectorParam("hidden", len = 2, lower = 2, upper = 4,
    trafo = function(x) round(10 ^ (x / 2))),
  makeDiscreteParam("rate", values = c(0.1, 0.5))
)

This uses the values 10 (= 10^1), 32 (≈ 10^1.5) and 100 (= 10^2) in any combination for the hidden layers.
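To get a feel for how much the transformation shrinks the search space, the two variants can be compared in base R. This is just an illustrative count of the candidate vectors; the variable names are made up for the example, and how many of these points a tuner actually visits depends on the tune control used.

```r
# Untransformed space: every integer from 10 to 100 for each of 2 layers.
n_plain <- length(10:100)^2

# Transformed space: the integers 2, 3, 4 mapped through the trafo.
trafo <- function(x) round(10^(x / 2))
layer_sizes <- trafo(2:4)

# All ordered pairs of layer sizes.
combos <- expand.grid(layer1 = layer_sizes, layer2 = layer_sizes)

print(n_plain)       # 8281
print(layer_sizes)   # 10 32 100
print(nrow(combos))  # 9
```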

mb706 answered Oct 28 '22 13:10
