How to perform a train, test and validation set to predict

Question

I have a really large dataset and I'm trying to build a classification model in R. However, I need to use train, test and validation sets, and I'm a bit confused about how to do this. For example, I built a tree using a training set and then computed the predictions using a test set. But I believe I should be using the train and test sets to tune the tree, and after that use the validation set to validate. How can I do this?

library(rpart)
part.installed <- rpart(TARGET ~ RS_DESC + SAP_STATUS + ACTIVATION_STATUS +
                          ROTUL_STATUS + SIM_STATUS + RATE_PLAN_SEGMENT_NORM,
                        data = trainSet, method = "class")

part.predictions <- predict(part.installed, testSet, type="class")

(P.S. The tree is only an example; it could be another classification algorithm.)

Has QUIT--Anony-Mousse · Accepted Answer

Usually the terminology is as follows:

The training set is used to build the classifier.
The validation set is used to tune the algorithm's hyperparameters, repeatedly. Some overfitting to the validation set will occur here, which is why there is another stage:
The test set must not be touched until the classifier is final, to prevent overfitting. It serves to estimate the accuracy you would actually get if you put the model into production.
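
The three roles described above can be sketched in R roughly as follows. This is a minimal illustration, not the asker's actual pipeline: the built-in iris data stands in for the real table, the 60/20/20 split ratio and the cp grid are arbitrary assumptions, and the helper accuracy() is a hypothetical name introduced only for this example.

```r
library(rpart)

set.seed(42)  # make the random split reproducible

# Stand-in data (assumption): iris replaces the asker's dataset,
# and Species replaces the TARGET column.
data <- iris
n <- nrow(data)

# Shuffle the row indices, then cut them 60% / 20% / 20% (assumed ratios).
idx <- sample(n)
n_train <- floor(0.6 * n)
n_valid <- floor(0.2 * n)
trainSet <- data[idx[1:n_train], ]
validSet <- data[idx[(n_train + 1):(n_train + n_valid)], ]
testSet  <- data[idx[(n_train + n_valid + 1):n], ]

# Hypothetical helper: fraction of rows whose predicted class matches the label.
accuracy <- function(model, newdata) {
  pred <- predict(model, newdata, type = "class")
  mean(as.character(pred) == as.character(newdata$Species))
}

# Tune one hyperparameter (rpart's complexity parameter cp):
# fit on the training set, score on the validation set.
cp_grid <- c(0.001, 0.01, 0.05, 0.1)
valid_acc <- sapply(cp_grid, function(cp) {
  fit <- rpart(Species ~ ., data = trainSet, method = "class",
               control = rpart.control(cp = cp))
  accuracy(fit, validSet)
})
best_cp <- cp_grid[which.max(valid_acc)]

# Fit the final model with the chosen cp, then use the test set exactly once.
final_fit <- rpart(Species ~ ., data = trainSet, method = "class",
                   control = rpart.control(cp = best_cp))
test_acc <- accuracy(final_fit, testSet)
print(test_acc)
```

The same pattern should carry over to other classifiers: only the fitting call and the hyperparameter grid change, while the test set stays out of every tuning decision.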

Tags:

validation

r

machine-learning

classification

training-data

Carolina Leana Santos
