
Using a variable for defining the training column in the predict function in R

Tags:

r

r-caret

Let us say that we have the following code (the training/testing partition for this problem is irrelevant).

library(caret)
data(iris)
train( Species ~ .,data=iris, method="rf" )  

Now this runs fine. What I want to be able to do is to provide the column I am trying to predict using a variable (because I am going to get it from a GUI). Let us use the example code below:

library(caret)
data(iris)
colName <- 'Species'
train( colName ~ .,data=iris, method="rf" )  

This does not work because R treats colName inside the formula as a literal column name, and the dataset has no column called colName. Is there a way of doing this? I have searched high and low and come up with nothing. Someone please help me :(.

ssm asked Mar 24 '26 08:03


1 Answer

This case is simple enough that building the formula from a string with paste and as.formula should be fine:

library(caret)
data(iris)
colName <- 'Species'

#create the formula using as.formula and paste
formula <- as.formula(paste(colName, ' ~ .' ))

#run model
train(formula, data=iris, method="rf" )  

Output:

> train( formula,data=iris, method="rf" )
Random Forest 

150 samples
  4 predictor
  3 classes: 'setosa', 'versicolor', 'virginica' 

No pre-processing
Resampling: Bootstrapped (25 reps) 

Summary of sample sizes: 150, 150, 150, 150, 150, 150, ... 

Resampling results across tuning parameters:

  mtry  Accuracy   Kappa      Accuracy SD  Kappa SD  
  2     0.9481249  0.9216819  0.02790700   0.04200793
  3     0.9473557  0.9205465  0.02893104   0.04347956
  4     0.9466284  0.9194525  0.02920803   0.04388548

Accuracy was used to select the optimal model using  the largest value.
The final value used for the model was mtry = 2. 
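
As a possible alternative to paste, base R's reformulate() builds the same kind of formula directly from character strings. The sketch below assumes colName arrives as a plain string from the GUI; the names model_formula and pasted_formula are only illustrative, and the column check is an extra guard that is not part of the original answer.

```r
# Base R only; iris ships with R's datasets package.
data(iris)
colName <- "Species"  # assumption: in practice this string comes from the GUI

# Guard: stop early if the requested name is not a column of the data.
stopifnot(colName %in% names(iris))

# Why the original call fails: the formula keeps the literal symbol colName
# instead of looking up the value stored in the variable.
print(all.vars(colName ~ .))

# reformulate() builds "response ~ terms" from character strings.
model_formula <- reformulate(".", response = colName)
print(model_formula)

# The paste/as.formula approach from the answer, for comparison.
pasted_formula <- as.formula(paste(colName, "~ ."))
print(pasted_formula)
```

Either formula object should then be usable in the same way as above, for example train(model_formula, data=iris, method="rf").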
LyzandeR answered Mar 27 '26 00:03


