Constrained Linear Regression in Python

I have a classic linear regression problem of the form:

y = X b

where y is a response vector, X is a matrix of input variables, and b is the vector of fit parameters I am searching for.

NumPy provides numpy.linalg.lstsq(X, y) for solving problems of this form; the first element of the tuple it returns is b.

However, when I use this I tend to get either extremely large or extremely small values for the components of b.

I'd like to perform the same fit, but constrain the values of b between 0 and 255.

It looks like scipy.optimize.fmin_slsqp() is an option, but I found it extremely slow for the size of problem I'm interested in (X is something like 3375 by 1500 and hopefully even larger).

  1. Are there any other Python options for performing constrained least squares fits?
  2. Or are there Python routines for performing Lasso regression, Ridge regression, or some other regression method that penalizes large b coefficient values?
asked Apr 14 '12 15:04 by ulmangt


2 Answers

Recent SciPy versions (0.17 and later) include a solver for bounded linear least squares, scipy.optimize.lsq_linear:

https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.lsq_linear.html#scipy.optimize.lsq_linear
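A minimal sketch of how lsq_linear could be applied to the bounds in the question (0 to 255). The data here is synthetic and much smaller than the 3375 by 1500 problem described; the names X, y and b_true are placeholders, not the asker's data.

```python
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)

# Synthetic stand-in for the real problem (assumed sizes, not the asker's data).
n_samples, n_features = 300, 50
X = rng.normal(size=(n_samples, n_features))
b_true = rng.uniform(0, 255, size=n_features)
y = X @ b_true + rng.normal(scale=0.1, size=n_samples)

# Minimize ||X b - y||^2 subject to 0 <= b <= 255 for every component of b.
result = lsq_linear(X, y, bounds=(0, 255))
b = result.x

print(result.status, b.min(), b.max())
```

The bounds argument also accepts a pair of arrays if each coefficient needs its own lower and upper limit.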

answered Sep 21 '22 19:09 by tillsten


You mention you would find Lasso Regression or Ridge Regression acceptable. These and many other constrained linear models are available in the scikit-learn package. Check out the section on generalized linear models.

Usually constraining the coefficients involves some kind of regularization parameter (C or alpha). Some of the models (the ones ending in CV) can use cross-validation to set these parameters automatically. You can also further constrain models to use only positive coefficients; for example, there is an option for this on the Lasso model.
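As a rough sketch of the above, assuming scikit-learn is installed: RidgeCV and LassoCV pick alpha by cross-validation, and positive=True restricts the Lasso coefficients to be non-negative. The data and the alpha grid are made up for illustration. Note that these models shrink coefficients and can enforce a sign, but they do not impose an upper bound such as 255.

```python
import numpy as np
from sklearn.linear_model import LassoCV, RidgeCV

rng = np.random.default_rng(0)

# Synthetic data (placeholder for the real X and y).
X = rng.normal(size=(200, 30))
b_true = rng.uniform(0, 5, size=30)
y = X @ b_true + rng.normal(scale=0.1, size=200)

# Ridge: L2 penalty on b; alpha is chosen from the candidate grid by cross-validation.
ridge = RidgeCV(alphas=np.logspace(-3, 3, 13), fit_intercept=False).fit(X, y)

# Lasso: L1 penalty on b; positive=True forces every coefficient to be >= 0.
lasso = LassoCV(cv=5, positive=True, fit_intercept=False).fit(X, y)

print(ridge.alpha_, lasso.alpha_, lasso.coef_.min())
```

fit_intercept=False is used here only to match the y = X b form in the question, which has no intercept term.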

answered Sep 17 '22 19:09 by conradlee