Numpy linear regression with regularization

I can't see what is wrong with my code for regularized linear regression. For the unregularized case I simply have this, which I'm reasonably certain is correct:

import numpy as np

def get_model(features, labels):
    return np.linalg.pinv(features).dot(labels)

Here's my code for the regularized solution:

def get_model(features, labels, lamb=0.0):
    n_cols = features.shape[1]
    # Normal equations with an L2 (ridge) penalty:
    # (X^T X + lamb * I)^(-1) X^T y
    return np.linalg.inv(features.T.dot(features) + lamb * np.identity(n_cols))\
            .dot(features.T).dot(labels)

With the default value of 0.0 for lamb, I expect this to give the same result as the (correct) unregularized version, but the difference between the two is actually quite large.

Does anyone see what the problem is?
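Here's a minimal way to reproduce the discrepancy with synthetic, nearly rank-deficient data (a hypothetical setup, just to show the symptom):

import numpy as np

# The third column is almost a copy of the first, so
# features.T.dot(features) is nearly singular.
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 2))
features = np.hstack([base, base[:, :1] + 1e-9 * rng.normal(size=(100, 1))])
labels = rng.normal(size=100)

pinv_model = np.linalg.pinv(features).dot(labels)
inv_model = get_model(features, labels)  # regularized version, lamb=0.0
print(np.abs(pinv_model - inv_model).max())  # large, nowhere near 0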

asked Dec 15 '14 by Marshall Farrier

1 Answer

The problem is that features.transpose().dot(features) may not be invertible: numpy.linalg.inv works only for full-rank matrices, according to the documentation. A positive regularization term, however, always makes the matrix nonsingular: features.transpose().dot(features) is positive semi-definite, so adding lamb * identity with lamb > 0 makes it positive definite, hence invertible.
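You can see this numerically from the condition number of the Gram matrix (a hypothetical illustration with nearly dependent columns):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
X = np.hstack([X, X[:, :1] + 1e-9 * rng.normal(size=(100, 1))])

# Near-singular Gram matrix vs. the regularized one.
gram = X.T.dot(X)
print(np.linalg.cond(gram))                         # astronomically large
print(np.linalg.cond(gram + 0.1 * np.identity(3)))  # modest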

By the way, you are right about the implementation, but it is not efficient. An efficient (and numerically more stable) way to solve this equation is a least-squares solver.

np.linalg.lstsq(features, labels)[0] can do the work of np.linalg.pinv(features).dot(labels); note that lstsq returns a tuple whose first element is the solution.
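For example, with some hypothetical random data:

import numpy as np

rng = np.random.default_rng(2)
features = rng.normal(size=(50, 4))
labels = rng.normal(size=50)

# lstsq returns (solution, residuals, rank, singular_values).
sol = np.linalg.lstsq(features, labels, rcond=None)[0]
print(np.allclose(sol, np.linalg.pinv(features).dot(labels)))  # True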

More generally, you can do this:

def get_model(A, y, lamb=0):
    n_col = A.shape[1]
    # Solve (A^T A + lamb * I) x = A^T y. lstsq returns a tuple
    # (solution, residuals, rank, singular_values); take element 0.
    return np.linalg.lstsq(A.T.dot(A) + lamb * np.identity(n_col),
                           A.T.dot(y), rcond=None)[0]
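A quick sanity check (hypothetical random data, assuming the get_model above is in scope): with the default lamb=0 it matches the unregularized pinv solution on well-conditioned inputs, and a positive lamb shrinks the coefficients as expected.

import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(100, 3))
y = rng.normal(size=100)

print(np.allclose(get_model(A, y), np.linalg.pinv(A).dot(y)))  # True
# Ridge regularization shrinks the solution norm.
print(np.linalg.norm(get_model(A, y, lamb=100.0))
      < np.linalg.norm(get_model(A, y)))                       # True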
answered Nov 13 '22 by nullas