
Why does Ridge model fitting show a warning when the power of the denominator in the alpha value is raised to 13 or more?

I was trying to create a loop to see how the accuracy scores on the train and test sets of the Boston housing dataset vary when fitted with a Ridge regression model.

This was the for loop:

for i in range(1,20):
    Ridge(alpha = 1/(10**i)).fit(X_train,y_train)

It showed a warning starting from i = 13.

The warning being:

LinAlgWarning: Ill-conditioned matrix (rcond=6.45912e-17): result may not be accurate.
  overwrite_a=True).T

What is the meaning of this warning? And is it possible to get rid of it?

I also tried executing it separately, outside the loop, but that didn't help either.

#importing libraries and packages

import mglearn
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge

#importing boston housing dataset from mglearn
X,y = mglearn.datasets.load_extended_boston()

#Splitting the dataset
X_train,X_test,y_train,y_test = train_test_split(X,y,random_state=0)

#Fitting the training data using Ridge model with alpha = 1/(10**13)
rd = Ridge(alpha = 1/(10**13)).fit(X_train,y_train)

It shouldn't display the warning mentioned above for any value of i.

asked Jan 26 '23 by Vinay Goudar

2 Answers

Try fitting your Ridge model with normalization: Ridge(normalize=True). I ran into the same error as you, and it was because my features included both extremely large and extremely small values, which were causing problems with the underlying linear algebra solver used to fit the model.
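Note that in recent scikit-learn releases (1.2 and later) the normalize argument has been removed from Ridge, so if that call raises an error, the usual equivalent is to scale the features in a pipeline before the fit. A minimal sketch, assuming the X_train/y_train/X_test/y_test from the question and an arbitrary alpha:

    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import Ridge

    # Scale the features before the Ridge fit; this plays roughly the same
    # role as the old normalize=True flag and improves the conditioning of
    # the underlying linear solve.
    rd = make_pipeline(StandardScaler(), Ridge(alpha=0.1)).fit(X_train, y_train)
    print(rd.score(X_train, y_train), rd.score(X_test, y_test))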

answered Feb 02 '23 by Geoff


In Ridge Regression you construct a kernel matrix which contains the similarities between all of your training samples. The parameters of the ridge fit are found by solving a linear system built from this matrix and your training labels. If you have e.g. two samples that are extremely similar, the matrix to be solved becomes nearly singular, i.e. ill-conditioned. To get around this, a small value is added to the diagonal, and that value is the alpha parameter you give. So what happens is that as your alpha value approaches 0, the matrix is more likely to be ill-conditioned (though it depends on the nature of your data), and the solver warns that the result may be inaccurate. A genuinely bad fit should also show up as poor cross-validation accuracy, so you don't have to worry too much about it.

So, all in all, if you keep your alpha above the warning threshold you will be fine, and in a cross-validation procedure the alpha value would likely be selected above this threshold anyway.
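To see concretely what the warning is reacting to, here is a small sketch reusing X_train from the question, but written in terms of the plain least-squares form of ridge, (X^T X + alpha*I) w = X^T y, rather than the kernelized view above. The condition number of the matrix being solved grows as alpha shrinks, and the solver warns once its reciprocal (the rcond value printed in the warning) gets down near machine precision:

    import numpy as np

    # Condition number of the ridge system matrix (X^T X + alpha * I).
    # The larger it gets, the less accurate the linear solve; the rcond
    # value in the LinAlgWarning is roughly 1 / condition number.
    XtX = X_train.T @ X_train
    I = np.eye(XtX.shape[0])
    for i in (1, 7, 13, 19):
        alpha = 1 / (10 ** i)
        print(f"i={i:2d}  cond(XtX + alpha*I) = {np.linalg.cond(XtX + alpha * I):.3e}")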

answered Feb 02 '23 by user2653663