 

Closed Form Ridge Regression

I am having trouble understanding the output of my function implementing multiple ridge regression. I am doing this from scratch in Python, using the closed form of the method, which is shown below:

w = (X^T X + lambda I)^(-1) X^T y

I have a training set X that is 100 rows x 10 columns and a vector y that is 100x1.

My attempt is as follows:

import numpy as np

def ridgeRegression(xMatrix, yVector, lambdaRange):
    wList = []

    for i in range(1, lambdaRange + 1):
        lambVal = i

        # compute the inner term (X.T X + lambda I)
        xTranspose = np.transpose(xMatrix)
        xTx = xTranspose @ xMatrix
        lamb_I = lambVal * np.eye(xTx.shape[0])

        # invert the inner term, i.e. (inner)**(-1)
        inner_matInv = np.linalg.inv(xTx + lamb_I)

        # compute the outer term (X.T y)
        outer_xTy = np.dot(xTranspose, yVector)

        # multiply together
        w = inner_matInv @ outer_xTy
        wList.append(w)

    print(wList)

For testing, I am running it with the first 5 lambda values. wList ends up holding 5 NumPy arrays, each of length 10 (one per coefficient, I assume).

Here is the first of those 5 arrays:

array([ 0.29686755,  1.48420319,  0.36388528,  0.70324668, -0.51604451,
        2.39045735,  1.45295857,  2.21437745,  0.98222546,  0.86124358])

My questions, for clarification:

1. Shouldn't there be 11 coefficients (1 for the y-intercept + 10 slopes)?
2. How do I get the Minimum Square Error from this computation?
3. What comes next if I wanted to plot this line?

I think I am just really confused as to what I'm looking at, since I'm still working on my linear algebra.

Thanks!

asked Feb 19 '19 by ptent

1 Answer

First, I would modify your ridge regression to look like the following:

import numpy as np
def ridgeRegression(X, y, lambdaRange):
    wList = []
    # Get normal form of `X`
    A = X.T @ X 
    # Get Identity matrix
    I = np.eye(A.shape[0])
    # Get right hand side
    c = X.T @ y
    for lambVal in range(1, lambdaRange+1):
        # Set up equations Bw = c        
        lamb_I = lambVal * I
        B = A + lamb_I
        # Solve for w
        w = np.linalg.solve(B,c)
        wList.append(w)        
    return wList
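
As a quick sanity check, you can call it like this (my own example, with made-up data of the 100x10 shape you described rather than your actual data):

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))  # 100 samples, 10 features
y = rng.standard_normal(100)

wList = ridgeRegression(X, y, 5)
print(len(wList), wList[0].shape)   # -> 5 (10,)

Each entry of wList is the weight vector for one value of lambda.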

Notice that I replaced your inv call, which forms the matrix inverse explicitly, with an implicit solve. This is much more numerically stable, which is an especially important consideration for these types of problems.
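
To see the difference, here is a toy comparison (my own illustration, not part of the solution above). Both approaches agree on a well-conditioned system, but solve avoids forming the inverse and degrades more gracefully as the system becomes ill-conditioned:

import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((10, 10))
B = M.T @ M + np.eye(10)            # symmetric positive definite, like X.T @ X + lambda * I
c = rng.standard_normal(10)

w_solve = np.linalg.solve(B, c)     # factorize B and back-substitute; no explicit inverse
w_inv = np.linalg.inv(B) @ c        # form the inverse explicitly, then multiply

print(np.allclose(w_solve, w_inv))  # True for this well-conditioned B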

I've also moved the A = X.T @ X computation, the identity matrix I generation, and the right-hand-side vector c = X.T @ y computation out of the loop, since these don't change between iterations and are relatively expensive to compute.

As was pointed out by @qwr, the number of columns of X will determine the number of coefficients you have. You have not described your model, so it's not clear how the underlying domain, x, is structured into X.

Traditionally, one might use polynomial regression, in which case X is the Vandermonde matrix. In that case, the first coefficient would be associated with the y-intercept. However, based on the context of your question, you seem to be interested in multivariate linear regression. In any case, the model needs to be clearly defined. Once it is, the returned weights may be used to further analyze your data.
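
For example, to get the 11th coefficient you asked about, one common convention (a sketch of my own, assuming you want an intercept in a multivariate linear model) is to prepend a column of ones to X. The first weight then acts as the y-intercept, and you can score each weight vector by its mean squared error:

import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 10))  # stand-in for your 100x10 training set
y = rng.standard_normal(100)

X_aug = np.column_stack([np.ones(X.shape[0]), X])  # 100x11: intercept column + 10 features

wList = ridgeRegression(X_aug, y, 5)               # each w now has 11 entries
for lambVal, w in enumerate(wList, start=1):
    mse = np.mean((y - X_aug @ w) ** 2)            # mean squared error of the fit
    print(f"lambda={lambVal}: MSE={mse:.4f}")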

answered Sep 19 '22 by jyalim