I am applying regularized logistic regression and have split my data into training, cross-validation, and test sets. To choose the regularization parameter lambda, I try several values of lambda and, for each one, fit the parameters theta of my hypothesis on the training set. I then choose the value of lambda that gives the lowest cost on the validation set. When I compute the cost on the validation set, should I include the penalization term or leave it out?
This mixes up two things. You minimize the cost function (with the regularization term) on the training set to pick the model parameters theta for a given hyperparameter such as lambda. Those parameters then let you classify the points in the validation set, and you measure how well the classification matches the ground truth. You pick the lambda that gives the most correct answers. The regularization term plays no role at that stage: if you compare validation costs rather than accuracy, compute them without the penalty.
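In symbols, the two quantities can be written as follows. This is only a sketch: it assumes the common convention of a lambda/(2m) penalty that skips the intercept theta_0, and the names J_train and J_cv are illustrative, since the question does not state its exact cost function.

```latex
% Minimized over \theta on the training set (m examples), for a fixed \lambda:
J_{\text{train}}(\theta;\lambda)
  = -\frac{1}{m}\sum_{i=1}^{m}\Big[\, y^{(i)} \log h_\theta\big(x^{(i)}\big)
      + \big(1-y^{(i)}\big) \log\big(1-h_\theta\big(x^{(i)}\big)\big) \Big]
    + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^{2}

% Evaluated on the validation set (m_cv examples) to compare values of \lambda; no penalty term:
J_{\text{cv}}(\theta)
  = -\frac{1}{m_{\text{cv}}}\sum_{i=1}^{m_{\text{cv}}}\Big[\, y_{\text{cv}}^{(i)} \log h_\theta\big(x_{\text{cv}}^{(i)}\big)
      + \big(1-y_{\text{cv}}^{(i)}\big) \log\big(1-h_\theta\big(x_{\text{cv}}^{(i)}\big)\big) \Big]
```

The penalty exists only to shape theta during fitting. Once theta is fixed, the validation measure should reflect how well the model predicts, whichever lambda produced it.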
You can draw the learning curves for each candidate value, check that the training and validation errors both converge to a small value, and pick the candidate with the smallest validation error as the regularization parameter.
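The selection procedure described here can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it assumes NumPy, synthetic data, plain gradient descent, and a lambda/(2m) penalty that skips the intercept. The helper names (`fit_logistic`, `error_rate`, `unregularized_cost`) and the grid of lambda values are hypothetical and do not come from the question.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, lam, lr=0.1, n_iter=2000):
    """Fit theta by gradient descent on the regularized training cost."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_iter):
        grad = X.T @ (sigmoid(X @ theta) - y) / m
        grad[1:] += (lam / m) * theta[1:]  # penalty term; intercept is not penalized
        theta -= lr * grad
    return theta


def error_rate(theta, X, y):
    """Fraction of misclassified points; lambda does not appear here."""
    predictions = (sigmoid(X @ theta) >= 0.5).astype(float)
    return float(np.mean(predictions != y))


def unregularized_cost(theta, X, y):
    """Average log loss without the penalty term."""
    h = np.clip(sigmoid(X @ theta), 1e-12, 1 - 1e-12)
    return float(-np.mean(y * np.log(h) + (1 - y) * np.log(1 - h)))


# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
m = 300
features = rng.normal(size=(m, 2))
noise = rng.normal(scale=0.5, size=m)
y = (features[:, 0] + 0.5 * features[:, 1] + noise > 0).astype(float)
X = np.hstack([np.ones((m, 1)), features])  # prepend intercept column

X_train, y_train = X[:180], y[:180]
X_val, y_val = X[180:240], y[180:240]
X_test, y_test = X[240:], y[240:]

lambdas = [0.0, 0.01, 0.1, 1.0, 10.0, 100.0]  # hypothetical candidate grid
rows = []
for lam in lambdas:
    theta = fit_logistic(X_train, y_train, lam)  # lambda is used only here
    rows.append({
        "lam": lam,
        "theta": theta,
        "train_error": error_rate(theta, X_train, y_train),
        "val_error": error_rate(theta, X_val, y_val),
        "val_cost": unregularized_cost(theta, X_val, y_val),
    })
    print(f"lambda={lam:<7} train_error={rows[-1]['train_error']:.3f} "
          f"val_error={rows[-1]['val_error']:.3f} val_cost={rows[-1]['val_cost']:.3f}")

# Pick the lambda with the smallest validation error, then report test error once.
best = min(rows, key=lambda r: r["val_error"])
test_error = error_rate(best["theta"], X_test, y_test)
print(f"chosen lambda={best['lam']}, test_error={test_error:.3f}")
```

Selecting on `val_cost` instead of `val_error` would be an equally reasonable variant. In both cases lambda enters only through `fit_logistic`, and the test set is touched once, after the choice is made.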
The choice of the regularization parameter has nothing to do with the value of the regularized cost function.