Pass PCA preprocessing arguments to train()

I'm trying to build a predictive model in caret using PCA as pre-processing. The pre-processing would be as follows:

preProc <- preProcess(IL_train[,-1], method="pca", thresh = 0.8)

Is it possible to pass the thresh argument directly to caret's train() function? I've tried the following, but it doesn't work:

modelFit_pp <- train(IL_train$diagnosis ~ . , preProcess="pca",
                            thresh= 0.8, method="glm", data=IL_train)

If not, how can I pass the separate preProc results to the train() function?

Timm S. asked Apr 14 '15 08:04


1 Answer

As per the documentation, additional preprocessing arguments are specified through the preProcOptions argument of trainControl():

?trainControl

...
preProcOptions  

A list of options to pass to preProcess. The type of pre-processing 
(e.g. center, scaling etc) is passed in via the preProc option in train.
...

Since your dataset is not reproducible, let's look at an example. I will use the Sonar dataset from mlbench and use the pls algorithm just for fun.

library(caret)
library(mlbench)

data(Sonar)

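# Options for preProcess() are supplied through trainControl()
ctrl <- trainControl(preProcOptions = list(thresh = 0.95))

# preProcess = "pca" turns PCA on; without it, preProcOptions has no effect
mod <- train(Class ~ .,
             data = Sonar,
             method = "pls",
             preProcess = "pca",
             trControl = ctrl)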
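
To check which threshold a fitted model actually used, the stored preProcess object can be inspected. This is a minimal sketch, assuming Sonar as a stand-in for the question's data, with glm and thresh = 0.8 taken from the question. The names ctrl80 and fit80 and the 5-fold cross-validation setting are illustrative choices, not part of the original post.

```r
library(caret)
library(mlbench)

data(Sonar)

# Assumption: Sonar replaces the asker's IL_train; Class plays the role of diagnosis.
set.seed(1)
ctrl80 <- trainControl(method = "cv", number = 5,
                       preProcOptions = list(thresh = 0.8))

# PCA is requested via preProcess; its threshold comes from ctrl80.
fit80 <- train(Class ~ .,
               data = Sonar,
               method = "glm",
               preProcess = "pca",
               trControl = ctrl80)

# The preProcess object kept on the fitted model records what was applied.
fit80$preProcess$thresh   # variance threshold that was requested
fit80$preProcess$numComp  # number of principal components retained
```

Because the PCA step runs inside train(), it is re-estimated within each resample, so the reported performance accounts for the pre-processing.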

Although documentation isn't the most exciting read, it is worth going through. Package authors work hard on it, and there are many wonders to be found within.
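
As for the second part of the question (using a separately computed preProc result), one option is to apply the transformation yourself with predict() and train on the transformed predictors. This is a rough sketch, again assuming Sonar in place of the question's data. The names predictors, pp, train_pc, fit_manual and new_pc are made up for illustration.

```r
library(caret)
library(mlbench)

data(Sonar)

# Predictors only; the outcome column must not enter the PCA.
predictors <- Sonar[, names(Sonar) != "Class"]

# Estimate the PCA transformation, keeping 80% of the variance.
pp <- preProcess(predictors, method = "pca", thresh = 0.8)

# Apply it; the result holds the component scores (PC1, PC2, ...).
train_pc <- predict(pp, predictors)

# Fit on the transformed predictors using train()'s x/y interface.
set.seed(1)
fit_manual <- train(x = train_pc, y = Sonar$Class, method = "glm")

# New data must pass through the same transformation before predicting.
new_pc <- predict(pp, predictors[1:5, ])
predict(fit_manual, new_pc)
```

Note that here the PCA is estimated once on all rows, outside the resampling loop, so the resampled performance estimates tend to be somewhat optimistic compared with letting train() handle the pre-processing.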

cdeterman answered Nov 07 '22 11:11