I am having trouble understanding the output of my function implementing multivariate ridge regression. I am writing it from scratch in Python using the closed-form solution of the method, shown below:

w = (Xᵀ X + λ I)⁻¹ Xᵀ y
I have a training set X that is 100 rows x 10 columns and a vector y that is 100x1.
My attempt is as follows:
import numpy as np

def ridgeRegression(xMatrix, yVector, lambdaRange):
    wList = []
    for i in range(1, lambdaRange + 1):
        lambVal = i
        # compute the inner values (X.T X + lambda I)
        xTranspose = np.transpose(xMatrix)
        xTx = xTranspose @ xMatrix
        lamb_I = lambVal * np.eye(xTx.shape[0])
        # invert inner, e.g. (inner)**(-1)
        inner_matInv = np.linalg.inv(xTx + lamb_I)
        # compute outer (X.T y)
        outer_xTy = np.dot(xTranspose, yVector)
        # multiply together
        w = inner_matInv @ outer_xTy
        wList.append(w)
    print(wList)
For testing, I am running it with the first 5 lambda values. wList becomes 5 numpy arrays, each of length 10 (I'm assuming one for each of the 10 coefficients). Here is the first of those 5 arrays:

array([ 0.29686755,  1.48420319,  0.36388528,  0.70324668, -0.51604451,
        2.39045735,  1.45295857,  2.21437745,  0.98222546,  0.86124358])
My questions, for clarification:

1. Shouldn't there be 11 coefficients (1 for the y-intercept + 10 slopes)?
2. How do I get the mean squared error (MSE) from this computation?
3. What comes next if I wanted to plot this line?

I think I am just really confused as to what I'm looking at, since I'm still working on my linear algebra. Thanks!
First, I would modify your ridge regression to look like the following:
import numpy as np

def ridgeRegression(X, y, lambdaRange):
    wList = []
    # Get normal form of `X`
    A = X.T @ X
    # Get identity matrix
    I = np.eye(A.shape[0])
    # Get right-hand side
    c = X.T @ y
    for lambVal in range(1, lambdaRange + 1):
        # Set up equations Bw = c
        lamb_I = lambVal * I
        B = A + lamb_I
        # Solve for w
        w = np.linalg.solve(B, c)
        wList.append(w)
    return wList
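For example, on data shaped like yours (the random X and y here are placeholders purely for illustration):

import numpy as np

rng = np.random.default_rng(42)
X = rng.standard_normal((100, 10))   # 100 samples, 10 features
y = rng.standard_normal(100)         # 100 targets

wList = ridgeRegression(X, y, 5)
print(len(wList))        # 5 -- one weight vector per lambda value
print(wList[0].shape)    # (10,) -- one coefficient per column of X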
Notice that I replaced your inv call, which computes the matrix inverse explicitly, with an implicit solve via np.linalg.solve. This is much more numerically stable, which is an important consideration especially for these types of problems.
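To see the two approaches side by side, both give the same answer on a well-conditioned system, but solve never forms the inverse at all (a minimal sketch with a synthetic B and c):

import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((10, 10))
B = M.T @ M + np.eye(10)            # symmetric positive definite, like X.T X + lambda*I
c = rng.standard_normal(10)

w_solve = np.linalg.solve(B, c)     # factorizes B and solves B w = c directly
w_inv = np.linalg.inv(B) @ c        # forms the explicit inverse first
print(np.allclose(w_solve, w_inv))  # True here, but solve is the safer default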
I've also taken the A = X.T @ X computation, identity matrix I generation, and right-hand-side vector c = X.T @ y computation out of the loop; these don't change within the loop and are relatively expensive to compute.
As was pointed out by @qwr, the number of columns of X will determine the number of coefficients you have. You have not described your model, so it's not clear how the underlying domain, x, is structured into X.

Traditionally, one might use polynomial regression, in which case X is the Vandermonde matrix. In that case, the first coefficient would be associated with the y-intercept. However, based on the context of your question, you seem to be interested in multivariate linear regression. In any case, the model needs to be clearly defined. Once it is, then the returned weights may be used to further analyze your data.