
Linear Regression with Python numpy

I'm trying to make a simple linear regression function but continue to encounter a

numpy.linalg.linalg.LinAlgError: Singular matrix error

Existing function (with debug prints):

import numpy as np

def makeLLS(inputData, targetData):
    print("In makeLLS:")
    print("    Shape inputData:", inputData.shape)
    print("    Shape targetData:", targetData.shape)
    # Normal equations: solve (X^T X) w = X^T y for w
    term1 = np.dot(inputData.T, inputData)
    term2 = np.dot(inputData.T, targetData)
    print("    Shape term1:", term1.shape)
    print("    Shape term2:", term2.shape)
    #print(term1)
    #print(term2)
    # Raises LinAlgError when term1 is singular
    result = np.linalg.solve(term1, term2)
    return result

The output to the console with my test data is:

In makeLLS:
    Shape trainInput1: (773, 10)
    Shape trainTargetData: (773, 1)
    Shape term1: (10, 10)
    Shape term2: (10, 1)

Then it errors on the linalg.solve line. This is a textbook linear regression function and I can't seem to figure out why it's failing.

What is the singular matrix error?

Jonathan asked Oct 13 '10 03:10




2 Answers

As explained in the other answer, linalg.solve expects a full-rank matrix. This is because it solves a matrix equation exactly rather than performing linear regression, which should work for matrices of any rank.

There are a few methods for linear regression. The simplest one I would suggest is the standard least-squares method: just use numpy.linalg.lstsq instead. Its documentation includes an example.
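As a rough sketch of what that swap might look like, the function below calls numpy.linalg.lstsq directly on the input matrix instead of forming and solving the normal equations. The name make_lls and the test data are hypothetical: the third column is deliberately a copy of the first, so that the normal-equations matrix would be singular, which may or may not be what is wrong with the asker's data.

```python
import numpy as np

def make_lls(input_data, target_data):
    """Least-squares fit that tolerates rank-deficient input.

    Returns the coefficient array and the effective rank of input_data.
    """
    coeffs, residuals, rank, singular_values = np.linalg.lstsq(
        input_data, target_data, rcond=None
    )
    return coeffs, rank

# Hypothetical data: column 2 duplicates column 0, so X.T @ X is singular.
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 2))
X = np.column_stack([x[:, 0], x[:, 1], x[:, 0]])
y = (2.0 * x[:, 0] - 1.0 * x[:, 1]).reshape(-1, 1)

coeffs, rank = make_lls(X, y)
print("rank:", rank)                  # fewer than 3 columns are independent
print("coeffs shape:", coeffs.shape)  # one coefficient per input column
```

When the returned rank is smaller than the number of columns, the coefficients are not unique; lstsq returns the minimum-norm solution, which still reproduces the targets as well as any other least-squares solution.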

Muhammad Alkarouri answered Oct 24 '22 08:10



A singular matrix is a square matrix whose determinant is zero. That means its rows are not linearly independent: at least one row can be written as a linear combination of the others. I'll use the example from numpy's linalg.solve documentation to demonstrate. Here is the docs' example:

>>> import numpy as np
>>> a = np.array([[3,1], [1,2]])
>>> b = np.array([9,8])
>>> x = np.linalg.solve(a, b)
>>> x
array([ 2.,  3.])

Now, I'll change a to make it singular.

>>> a = np.array([[2,4], [1,2]])
>>> x = np.linalg.solve(a, b)
...
LinAlgError: Singular matrix

This is a very obvious example because the first row is just double the second row, but hopefully you get the point.
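For a less obvious matrix, such as the 10x10 one in the question, one way to check for this condition before calling solve is to compare the matrix rank with its size. The sketch below reuses the singular matrix from above; the variable names are only illustrative.

```python
import numpy as np

a = np.array([[2, 4], [1, 2]])
b = np.array([9, 8])

# A square matrix is singular when its rank is below its dimension.
rank = np.linalg.matrix_rank(a)
is_singular = rank < a.shape[0]

# The determinant is zero (up to floating-point error) for a singular matrix.
det = np.linalg.det(a)

# solve() raises LinAlgError for this matrix, so catch it explicitly.
try:
    np.linalg.solve(a, b)
    solve_failed = False
except np.linalg.LinAlgError:
    solve_failed = True

print("rank:", rank, "singular:", is_singular, "solve failed:", solve_failed)
```

With real data a matrix can also be nearly singular without raising an error, in which case np.linalg.cond(a) returning a very large value is a useful warning sign.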

Justin Peel answered Oct 24 '22 07:10
