
sklearn: are LassoCV() and ElasticNetCV() broken?

sklearn provides a LASSO method for regression estimation. However, when I call LassoCV().fit(X, y) with y a 2-D matrix, it throws an error. See the session output below, and the link to the documentation. The sklearn version I am using is 0.15.2.

http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LassoCV.html#sklearn.linear_model.LassoCV

The documentation says y can be a 2-D ndarray:

y : array-like, shape (n_samples,) or (n_samples, n_targets)

When I use plain Lasso() to fit the same X and y, it works fine. So I wonder whether LassoCV() is broken or whether I need to do something else.

In [2]: import numpy as np

In [3]: import sklearn.linear_model

In [4]: from sklearn import linear_model

In [5]: X = np.random.random((10,100))

In [6]: y = np.random.random((50, 100)) 

In [7]: linear_model.Lasso().fit(X,y) 
Out[7]: 
Lasso(alpha=1.0, copy_X=True, fit_intercept=True, max_iter=1000,
   normalize=False, positive=False, precompute='auto', tol=0.0001,
   warm_start=False)

In [8]: linear_model.LassoCV().fit(X,y)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-9c8ad3459ac8> in <module>()
----> 1 linear_model.LassoCV().fit(X,y)

/chimerahomes/wenhoujx/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/sklearn/linear_model/coordinate_descent.pyc in fit(self, X, y)
   1006             if y.ndim > 1:
   1007                 raise ValueError("For multi-task outputs, use "
-> 1008                                  "MultiTask%sCV" % (model_str))
   1009         else:
   1010             if sparse.isspmatrix(X):

ValueError: For multi-task outputs, use MultiTaskLassoCV


The ElasticNetCV() and ElasticNet() pair seems to behave the same way: the former suggests using MultiTaskElasticNetCV(), while the latter works fine with a 2-D y matrix.

asked Oct 18 '14 21:10 by fast tooth


1 Answer

Contrary to what is written in some docstrings, the normal lasso estimators, such as the one you are using, do not support multiple targets.

The error message is telling you to use MultiTaskLassoCV, the cross-validated version of MultiTaskLasso. This is a type of group lasso that forces the same sparse support for every target. If this is what you need, go ahead and use it. If not, as of now there is no useful alternative to looping across targets, which is embarrassingly parallel and can be distributed using sklearn.externals.joblib.
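As a rough illustration of the multi-task option, here is a minimal sketch that fits MultiTaskLassoCV on a 2-D y. The synthetic data, the shapes, the seed, and cv=3 are assumptions made for the example and are not taken from the question. It only shows that the fit accepts several targets at once and that each feature ends up either selected for all targets or dropped for all of them.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLassoCV

rng = np.random.RandomState(0)
n_samples, n_features, n_targets = 40, 8, 3  # illustrative sizes

X = rng.randn(n_samples, n_features)

# Hypothetical ground truth: only the first two features matter,
# and they matter for every target.
W_true = np.zeros((n_features, n_targets))
W_true[:2, :] = rng.randn(2, n_targets) + 3.0
Y = X.dot(W_true) + 0.1 * rng.randn(n_samples, n_targets)

# One joint fit over all targets; a single alpha is chosen by cross-validation.
model = MultiTaskLassoCV(cv=3).fit(X, Y)

coef = model.coef_      # shape (n_targets, n_features)
support = coef != 0     # True where a coefficient is nonzero

# With the group penalty, a feature is nonzero for all targets or for none.
shared_support = np.array_equal(support.all(axis=0), support.any(axis=0))
print(coef.shape, shared_support)
```

If the targets are expected to depend on different features, this shared support is a constraint rather than a convenience, and fitting each target separately is the better match.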

(If you feel like contributing multi-target support for independent targets, a pull request on GitHub would be very much welcomed.)
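In the meantime, the per-target loop for independent targets could look roughly like the sketch below. The helper names fit_single_target and fit_independent_lassos are hypothetical, and the data, seed, and cv=3 are made up for the example. joblib is imported as the standalone package here, since newer scikit-learn releases no longer ship sklearn.externals.joblib. The prefer="threads" hint is a choice made for this sketch to avoid pickling the helper; dropping it lets joblib use worker processes instead.

```python
import numpy as np
from joblib import Parallel, delayed
from sklearn.linear_model import LassoCV


def fit_single_target(X, y_col, cv=3):
    """Fit an independent cross-validated lasso on one target column."""
    return LassoCV(cv=cv).fit(X, y_col)


def fit_independent_lassos(X, Y, n_jobs=2, cv=3):
    """Fit one LassoCV per column of Y; each fit selects its own alpha."""
    return Parallel(n_jobs=n_jobs, prefer="threads")(
        delayed(fit_single_target)(X, Y[:, k], cv) for k in range(Y.shape[1])
    )


rng = np.random.RandomState(0)
X = rng.randn(40, 8)

# Hypothetical ground truth: each target depends on a different single feature.
W_true = np.zeros((8, 3))
W_true[0, 0], W_true[3, 1], W_true[5, 2] = 2.0, -3.0, 1.5
Y = X.dot(W_true) + 0.1 * rng.randn(40, 3)

models = fit_independent_lassos(X, Y)

coef = np.vstack([m.coef_ for m in models])    # shape (n_targets, n_features)
alphas = np.array([m.alpha_ for m in models])  # one regularization strength per target
print(coef.shape, alphas)
```

Unlike the multi-task estimator, each target here gets its own alpha and its own sparsity pattern, which is the behaviour the question expected from passing a 2-D y to LassoCV.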

answered Sep 27 '22 02:09 by eickenberg