
How to perform logistic lasso in Python?

The scikit-learn package provides the functions Lasso() and LassoCV(), but no option to fit a logistic function instead of a linear one. How do I perform logistic lasso in Python?

asked Jan 13 '17 by Fringant

People also ask

Can you use Lasso for logistic regression in Python?

We can use LASSO to reduce overfitting in models by selecting features. It works with linear regression, logistic regression, and several other models. Essentially, if the model has coefficients, LASSO can be used.

How do I run a lasso in Python?

In Python, Lasso regression can be performed using the Lasso class from the sklearn.linear_model module. The Lasso class takes a parameter called alpha, which represents the strength of the regularization term. A higher alpha value results in a stronger penalty, and therefore fewer features being used in the model.
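For example (a minimal sketch, not from the page; the data here are synthetic):

from sklearn.linear_model import Lasso
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=10, noise=1.0, random_state=0)
lasso = Lasso(alpha=1.0)  # larger alpha -> stronger L1 penalty -> sparser model
lasso.fit(X, y)
print(lasso.coef_)  # several coefficients are exactly zero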

Is lasso a logistic regression?

LASSO is a penalized regression approach that estimates the regression coefficients by maximizing the log-likelihood function (or minimizing the sum of squared residuals) under the constraint that the sum of the absolute values of the regression coefficients, ∑_{j=1}^{k} |β_j|, is less than or equal to a positive constant s.
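For reference, the equivalent penalized (Lagrangian) form that software such as scikit-learn and glmnet actually solves trades the constraint s for a penalty weight λ:

\min_{\beta} \; -\ell(\beta) + \lambda \sum_{j=1}^{k} |\beta_j|

where \ell(\beta) is the log-likelihood; a larger λ (corresponding to a smaller s) yields sparser coefficients.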


2 Answers

The Lasso optimizes a least-squares problem with an L1 penalty. By definition you can't optimize a logistic function with the Lasso.

If you want to optimize a logistic function with an L1 penalty, you can use the LogisticRegression estimator with the L1 penalty:

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
log = LogisticRegression(penalty='l1', solver='liblinear')  # 'liblinear' supports L1
log.fit(X, y)

Note that only the LIBLINEAR and SAGA (added in v0.19) solvers handle the L1 penalty.
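A quick illustration of how the penalty strength interacts with sparsity (my own sketch, not part of the original answer): in LogisticRegression, C is the inverse of the regularization strength, so smaller C means a stronger L1 penalty and more coefficients driven exactly to zero.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
for C in (1.0, 0.1, 0.01):
    clf = LogisticRegression(penalty='l1', solver='liblinear', C=C).fit(X, y)
    # count how many coefficients the L1 penalty has zeroed out
    print(C, int(np.sum(clf.coef_ == 0)), "zero coefficients")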

answered Sep 17 '22 by TomDLT

1 scikit-learn: sklearn.linear_model.LogisticRegression

sklearn.linear_model.LogisticRegression from scikit-learn is probably the best:

As @TomDLT said, Lasso is for the least-squares (regression) case, not logistic (classification).

from sklearn.linear_model import LogisticRegression

model = LogisticRegression(
    penalty='l1',
    solver='saga',  # or 'liblinear'
    C=1.0)  # note: C is the *inverse* of the regularization strength

model.fit(x, y)
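Since the question also mentions LassoCV, it is worth noting that the classification analogue with built-in cross-validation is LogisticRegressionCV, which searches over a grid of C values (this sketch is mine, not part of the original answer; x and y are your data):

from sklearn.linear_model import LogisticRegressionCV

model = LogisticRegressionCV(
    Cs=10,           # try 10 values of C on a logarithmic grid
    penalty='l1',
    solver='saga',   # 'saga' and 'liblinear' support the L1 penalty
    cv=5)            # 5-fold cross-validation

model.fit(x, y)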

2 python-glmnet: glmnet.LogitNet

You can also use Civis Analytics' python-glmnet library. This implements the scikit-learn BaseEstimator API:

# source: https://github.com/civisanalytics/python-glmnet#regularized-logistic-regression

from glmnet import LogitNet

m = LogitNet(
    alpha=1,  # 0 <= alpha <= 1, 0 for ridge, 1 for lasso
)
m = m.fit(x, y)

I'm not sure how to adjust the penalty with LogitNet, but I'll let you figure that out.
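My understanding from the python-glmnet README (an assumption worth verifying) is that you don't set the penalty directly: LogitNet fits a whole regularization path and picks the penalty strength by internal cross-validation.

# Assumed attribute, based on the python-glmnet README:
# the cross-validated choice of lambda is exposed after fitting.
print(m.lambda_best_)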

3 other

PyMC

You can also take a fully Bayesian approach: rather than using L1-penalized optimization to find a point estimate for your coefficients, you can approximate the posterior distribution of your coefficients given your data. With a Laplace prior on the coefficients, the posterior mode (the MAP estimate) coincides with the L1-penalized maximum-likelihood solution; the Laplace prior is what induces the sparsity.

The PyMC folks have a tutorial here on setting something like that up. Good luck.
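For concreteness, a minimal sketch of that setup (my own, assuming the PyMC v4+ API; the toy data and the prior scale b=1.0 are arbitrary):

import numpy as np
import pymc as pm  # PyMC v4+; older code used `import pymc3 as pm`

# toy data: two informative features, two irrelevant ones
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = (X @ np.array([1.5, 0.0, -2.0, 0.0]) + rng.normal(scale=0.5, size=100) > 0).astype(int)

with pm.Model():
    # Laplace prior on the coefficients: the Bayesian counterpart of the L1 penalty
    beta = pm.Laplace("beta", mu=0.0, b=1.0, shape=X.shape[1])
    intercept = pm.Normal("intercept", mu=0.0, sigma=10.0)
    p = pm.math.sigmoid(pm.math.dot(X, beta) + intercept)
    pm.Bernoulli("obs", p=p, observed=y)
    idata = pm.sample(1000)  # posterior over beta, rather than a point estimate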

answered Sep 18 '22 by grisaitis