 

How to calculate the regularization parameter in linear regression

When a high-degree polynomial is used to fit a set of points in a linear regression setup, we use regularization to prevent overfitting, and we include a lambda parameter in the cost function. This lambda is then used when updating the theta parameters in the gradient descent algorithm.
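For reference, here is the standard form of the regularized cost function and the corresponding gradient descent update being described (a sketch in the usual notation, with m training examples, n features, and learning rate alpha):

```latex
J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda \sum_{j=1}^{n} \theta_j^2\right]

% Gradient descent update for j = 1, ..., n (theta_0 is conventionally not regularized):
\theta_j := \theta_j\left(1 - \alpha\frac{\lambda}{m}\right) - \frac{\alpha}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}
```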

My question is how do we calculate this lambda regularization parameter?

asked Aug 29 '12 by London guy


People also ask

What is regularization parameter in linear regression?

Regularization is a technique in machine learning that tries to achieve generalization of the model: the model should work well not only on the training and test data, but also on data it will receive in the future.

How is the value of the regularization parameter determined?

The optimal regularization parameter is determined by QFM. The curve of QF is shown in Figure 2, which indicates that for some values of the parameter the values of QF are far from 1, while for others almost all the values of QF are equal to 1, which satisfies the needs of the QFM algorithm.

What is the regularization parameter?

The regularization parameter is a control on your fitting parameters. As the magnitudes of the fitting parameters increase, there will be an increasing penalty on the cost function. This penalty depends on the squares of the parameters as well as the magnitude of lambda.


1 Answer

The regularization parameter (lambda) is an input to your model, so what you probably want to know is how to select the value of lambda. The regularization parameter reduces overfitting, which reduces the variance of your estimated regression parameters; however, it does this at the expense of adding bias to your estimate. Increasing lambda results in less overfitting but also greater bias. So the real question is "How much bias are you willing to tolerate in your estimate?"

One approach you can take is to randomly subsample your data a number of times and look at the variation in your estimate. Then repeat the process for a slightly larger value of lambda to see how it affects the variability of your estimate. Keep in mind that whatever value of lambda you decide is appropriate for your subsampled data, you can likely use a smaller value to achieve comparable regularization on the full data set.
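Here is a minimal sketch of that subsampling procedure, assuming scikit-learn's Ridge (its alpha parameter plays the role of lambda) and synthetic data, so that the bias of the estimate can be measured against known true weights:

```python
# Sketch of the subsampling idea: for each candidate lambda, fit the
# model on repeated random subsamples and look at the variability
# (and, since the data is synthetic, the bias) of the estimates.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])   # known true parameters
X = rng.normal(size=(200, 5))
y = X @ w_true + rng.normal(size=200)

for lam in [0.01, 0.1, 1.0, 10.0, 100.0]:        # candidate lambda values
    coefs = []
    for _ in range(50):                          # repeated random subsamples
        idx = rng.choice(len(X), size=100, replace=False)
        coefs.append(Ridge(alpha=lam).fit(X[idx], y[idx]).coef_)
    coefs = np.array(coefs)
    variability = coefs.std(axis=0).mean()       # spread across subsamples
    bias = np.abs(coefs.mean(axis=0) - w_true).mean()
    print(f"lambda={lam:7.2f}  variability={variability:.4f}  bias={bias:.4f}")
```

Running this, you should see the coefficient variability shrink and the bias grow as lambda increases, which is exactly the trade-off described above.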

answered Sep 22 '22 by bogatron