 

Regression with Lasso, all coeffs are 0

I am currently experimenting with Lasso in scikit-learn in the high-dimensional case. The labels are Y_i (real numbers) and the features are X_i (each X_i is a vector of size d=112). I have only three pairs (Y_i, X_i).

Since d >> n = 3, we are in the high-dimensional setting.

import numpy as np

Y = np.array([ 0.24186978,  0.20693342,  0.00441244])

X0 = np.array([ 0.49019359, -0.11332346,  0.46826879, -0.13540658,  0.37022392, -0.23379722,  0.37143564, -0.2329437 ,  0.37291492, -0.23186138, 0.37469679, -0.23055168,  0.30316716, -0.29125359,  0.30840626, -0.28652415,  0.44230139, -0.16121566,  0.42683712, -0.17683825, 0.32256713, -0.28145402,  0.3280964 , -0.27628293,  0.33245644, -0.27231986,  0.33670266, -0.26854582,  0.2643481 , -0.33007265, 0.27145917, -0.32347124,  0.3864629 , -0.21705415,  0.3808803 , -0.22279507,  0.27458751, -0.32943364,  0.28447461, -0.31990473, 0.2917428 , -0.3130335 ,  0.29848329, -0.30676519,  0.22697144, -0.36744932,  0.2357466 , -0.35918381,  0.32553467, -0.27798238, 0.33200664, -0.27166872,  0.22802673, -0.37599441,  0.24186978, -0.36250956,  0.25182545, -0.35295084,  0.26090483, -0.34434365, 0.19180827, -0.40261249,  0.20193396, -0.39299645,  0.26323078, -0.34028627,  0.28211954, -0.32155583,  0.18444715, -0.419574  , 0.20146085, -0.40291849,  0.21366417, -0.39111212,  0.2247606 , -0.38048788,  0.15946525, -0.43495551,  0.17055441, -0.424376  , 0.20348854, -0.40002851,  0.23321321, -0.37046216,  0.14509726, -0.45892388,  0.16422526, -0.44015407,  0.17807138, -0.42670492, 0.1907319 , -0.41451658,  0.13036714, -0.46405362,  0.14199556, -0.45293485,  0.14977732, -0.45373973,  0.18715638, -0.41651899, 0.11082473, -0.49319641,  0.13088375, -0.47349559,  0.145673  , -0.45910329,  0.15936004, -0.44588844,  0.10475443, -0.48966633, 0.11649699, -0.47843342])
X1 = np.array([ 0.08172583,  0.08172583,  0.12787895,  0.12787895,  0.17680895, 0.17680895,  0.20428698,  0.20428698,  0.22810783,  0.22810783, 0.24952302,  0.24952302,  0.25443032,  0.25443032,  0.27212382, 0.27212382,  0.09939284,  0.09939284,  0.14649492,  0.14649492, 0.18353275,  0.18353275,  0.21186616,  0.21186616,  0.23646753, 0.23646753,  0.25859485,  0.25859485,  0.25241207,  0.25241207, 0.27111512,  0.27111512,  0.11277054,  0.11277054,  0.16042754, 0.16042754,  0.18318121,  0.18318121,  0.21269144,  0.21269144, 0.23825706,  0.23825706,  0.26132525,  0.26132525,  0.24416304, 0.24416304,  0.26402983,  0.26402983,  0.11961642,  0.11961642, 0.16822144,  0.16822144,  0.17599107,  0.17599107,  0.20693342, 0.20693342,  0.23361131,  0.23361131,  0.25782472,  0.25782472, 0.23053159,  0.23053159,  0.2516101 ,  0.2516101 ,  0.11876227, 0.11876227,  0.16908658,  0.16908658,  0.16286772,  0.16286772, 0.19528754,  0.19528754,  0.22310772,  0.22310772,  0.24857796, 0.24857796,  0.21262181,  0.21262181,  0.23482641,  0.23482641, 0.11042389,  0.11042389,  0.16301827,  0.16301827,  0.14522374, 0.14522374,  0.17886349,  0.17886349,  0.20768069,  0.20768069, 0.23437567,  0.23437567,  0.19167763,  0.19167763,  0.21478313, 0.21478313,  0.09612585,  0.09612585,  0.15078275,  0.15078275, 0.1247584 ,  0.1247584 ,  0.15903691,  0.15903691,  0.18850909, 0.18850909,  0.21622738,  0.21622738,  0.16897004,  0.16897004, 0.1926264 ,  0.1926264 ])
X2 = np.array([ 0.0039031 ,  0.0039031 ,  0.00346908,  0.00346908,  0.00450824, 0.00450824,  0.00409751,  0.00409751,  0.0038224 ,  0.0038224 , 0.00358683,  0.00358683,  0.00393648,  0.00393648,  0.00374151, 0.00374151,  0.00488007,  0.00488007,  0.0040774 ,  0.0040774 , 0.00478876,  0.00478876,  0.00434275,  0.00434275,  0.0040458 , 0.0040458 ,  0.00379218,  0.00379218,  0.00397968,  0.00397968, 0.00379608,  0.00379608,  0.00568263,  0.00568263,  0.00457514, 0.00457514,  0.00488406,  0.00488406,  0.00444946,  0.00444946, 0.00415691,  0.00415691,  0.00390482,  0.00390482,  0.00391778, 0.00391778,  0.00375997,  0.00375997,  0.00617576,  0.00617576, 0.00490909,  0.00490909,  0.00478816,  0.00478816,  0.00441244, 0.00441244,  0.00415124,  0.00415124,  0.00392093,  0.00392093, 0.00375961,  0.00375961,  0.00363975,  0.00363975,  0.00627155, 0.00627155,  0.00504258,  0.00504258,  0.00451513,  0.00451513, 0.00423891,  0.00423891,  0.00403303,  0.00403303,  0.00384307, 0.00384307,  0.0035197 ,  0.0035197 ,  0.00344643,  0.00344643, 0.00595365,  0.00595365,  0.00496165,  0.00496165,  0.00409633, 0.00409633,  0.003947  ,  0.003947  ,  0.00381432,  0.00381432, 0.00367948,  0.00367948,  0.00321652,  0.00321652,  0.00319428, 0.00319428,  0.0052817 ,  0.0052817 ,  0.00467728,  0.00467728, 0.00357511,  0.00357511,  0.00356312,  0.00356312,  0.00351338, 0.00351338,  0.0034431 ,  0.0034431 ,  0.00287055,  0.00287055, 0.00289938,  0.00289938])
X = np.array([X0,X1,X2])

The data are constructed so that an exact solution to Y = X.theta exists, where theta is a d-dimensional vector of all zeros except for a 1 at index 54:

>>> Y
array([ 0.24186978,  0.20693342,  0.00441244])
>>> X[0, 54]
0.24186978045754323
>>> X[1, 54]
0.20693341629897405
>>> X[2, 54]
0.0044124449820170455
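
Equivalently, writing that vector as theta_expected (the name is only for illustration), the reconstruction is exact:

>>> theta_expected = np.zeros(112)
>>> theta_expected[54] = 1.0
>>> np.allclose(X.dot(theta_expected), Y)
True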

However, when I apply Lasso, I do not get the expected result:

from sklearn.linear_model import Lasso
lasso = Lasso(alpha=0.1)
res = lasso.fit(X,Y)

Giving:

>>> res.coef_.tolist()
[0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, 0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, 0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0]

By changing the penalty coefficient:

lasso = Lasso(alpha=0.01)
res = lasso.fit(X,Y)

The result is still not the expected one:

>>> res.coef_.tolist()
  [0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.24488850166974235, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0] 
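
The only non-zero coefficient is at index 10, not at the expected index 54:

>>> np.flatnonzero(res.coef_)
array([10])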

How can I retrieve the expected vector of coefficients?

asked Sep 23 '14 by Colonel Beauvel



1 Answer

Lasso does not solve the l0-penalized least squares problem but the l1-penalized one. The solution you get for alpha=0.01 is the Lasso solution (with a single non-zero coefficient of ~0.245 for feature #10).

Even though your expected solution has a squared reconstruction error of 0.0, it still carries an l1 penalty of 1.0 (multiplied by alpha).

The solution found by lasso for alpha=0.01 has a small squared reconstruction error of 0.04387 (divided by 2 * n_samples == 6 in the objective) and a smaller l1 penalty of 0.245 (multiplied by alpha).

The objective function minimized by lasso is given in the docstring:

  • http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html
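
For reference, that objective is (1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1. Evaluating it for both candidate solutions makes the trade-off explicit. This is only a rough check: it ignores the intercept that Lasso fits by default, and w_expected is just a name for the exact solution described in the question:

import numpy as np
from sklearn.linear_model import Lasso

alpha = 0.01
lasso = Lasso(alpha=alpha).fit(X, Y)

# the exact solution from the question: all zeros except a 1 at index 54
w_expected = np.zeros(X.shape[1])
w_expected[54] = 1.0

def objective(w):
    # (1 / (2 * n_samples)) * ||Y - Xw||^2_2 + alpha * ||w||_1, intercept ignored
    return np.sum((Y - X.dot(w)) ** 2) / (2 * len(Y)) + alpha * np.abs(w).sum()

print(objective(w_expected))   # ~0.0100: zero reconstruction error, but an l1 norm of 1.0
print(objective(lasso.coef_))  # ~0.0098: small error, and an l1 norm of only ~0.245

The lasso solution has the lower objective value, which is why it is returned instead of the exact reconstruction.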

To summarize the different priors (or penalties) commonly used to regularize least squares regression:

  • l2 penalty favors any number of non-zero coefficients but with very small absolute values (close to zero)
  • l1 penalty favors a small number of non-zero coefficients with small absolute values.
  • l0 favors a small number of non-zero coefficients of any absolute value.

Because l0 is non-convex, it is much harder to optimize than l1 or l2. This is why people use l1 (lasso) or l1 + l2 (elastic net) in practice to find sparse solutions, even if they are not as clean as l0 solutions.
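
If the goal in this specific example is really to recover the single exact coefficient at index 54, a greedy l0-style method such as scikit-learn's OrthogonalMatchingPursuit, restricted to one non-zero coefficient, can find it here because column 54 reproduces Y exactly. This is only a sketch for this particular dataset, not a general replacement for lasso:

from sklearn.linear_model import OrthogonalMatchingPursuit
import numpy as np

# greedy l0-style selection, allowing a single non-zero coefficient
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=1).fit(X, Y)

print(np.flatnonzero(omp.coef_))  # expected: [54]
print(omp.coef_[54])              # expected: ~1.0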

answered Oct 14 '22 by ogrisel