sklearn provides the Lasso method for regression estimation. However, when I call LassoCV().fit(X, y) with y a matrix, it throws an error. See the session below, and the link to the documentation. The sklearn version I am using is 0.15.2.
http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LassoCV.html#sklearn.linear_model.LassoCV
The documentation says y can be a 2-D ndarray:
y : array-like, shape (n_samples,) or (n_samples, n_targets)
When I use plain Lasso() to fit the same X and y, it works fine. So I wonder whether LassoCV() is broken, or whether I need to do something else.
In [2]: import numpy as np
In [3]: import sklearn.linear_model
In [4]: from sklearn import linear_model
In [5]: X = np.random.random((10,100))
In [6]: y = np.random.random((50, 100))
In [7]: linear_model.Lasso().fit(X,y)
Out[7]:
Lasso(alpha=1.0, copy_X=True, fit_intercept=True, max_iter=1000,
normalize=False, positive=False, precompute='auto', tol=0.0001,
warm_start=False)
In [8]: linear_model.LassoCV().fit(X,y)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-8-9c8ad3459ac8> in <module>()
----> 1 linear_model.LassoCV().fit(X,y)
/chimerahomes/wenhoujx/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/sklearn/linear_model/coordinate_descent.pyc in fit(self, X, y)
1006 if y.ndim > 1:
1007 raise ValueError("For multi-task outputs, use "
-> 1008 "MultiTask%sCV" % (model_str))
1009 else:
1010 if sparse.isspmatrix(X):
ValueError: For multi-task outputs, use MultiTaskLassoCV
In [9]:
The ElasticNetCV() and ElasticNet() pair seems to behave the same way: the former suggests using MultiTaskElasticNetCV(), while the latter works fine with a 2-D matrix.
Contrary to what is written in some docstrings, the normal lasso estimators, such as the one you are using, do not support multiple targets.
The error message is telling you to use MultiTaskLassoCV, the cross-validated variant of MultiTaskLasso, which is a type of group lasso that forces the same sparse support for every target. If this is what you need, go ahead and use it. If not, as of now there is no other useful way than to loop across targets, which is embarrassingly parallel and can be distributed using sklearn.externals.joblib.
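As a rough illustration of the first option, here is a minimal sketch of fitting MultiTaskLassoCV on a 2-D y. The data, shapes, and the `cv=3` setting are made up for the example, and it assumes a reasonably recent scikit-learn rather than 0.15.2.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLassoCV

# Synthetic data (an assumption for this sketch): 60 samples, 20 features,
# 3 targets that all depend on the same 4 features.
rng = np.random.RandomState(0)
X = rng.standard_normal((60, 20))
W = np.zeros((3, 20))
W[:, :4] = rng.standard_normal((3, 4))
y = X @ W.T + 0.1 * rng.standard_normal((60, 3))

# One joint fit for all targets; alpha is chosen by cross-validation.
model = MultiTaskLassoCV(cv=3).fit(X, y)

# coef_ has shape (n_targets, n_features).
print(model.coef_.shape)

# Shared sparse support: a feature is kept or dropped for all targets at once.
support = model.coef_ != 0
print(support.any(axis=0))
```

Note that this picks a single alpha for all targets, which is part of what distinguishes it from fitting each target independently.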
(If you feel like contributing multi target support for independent targets, a pull request on github would be very much welcomed.)
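Until then, the loop across targets could be sketched roughly as follows. The helper name `fit_one_target`, the data, and the `n_jobs=2` and `cv=3` settings are hypothetical choices for illustration. The sketch also assumes a recent installation where joblib is imported as a standalone package; on old versions such as 0.15.2 the same names would come from sklearn.externals.joblib.

```python
import numpy as np
from joblib import Parallel, delayed  # assumption: standalone joblib is installed
from sklearn.linear_model import LassoCV


def fit_one_target(X, y_col):
    """Hypothetical helper: fit an independent LassoCV on a single target column."""
    return LassoCV(cv=3).fit(X, y_col)


# Made-up data: 50 samples, 20 features, 3 targets.
rng = np.random.RandomState(0)
X = rng.standard_normal((50, 20))
y = rng.standard_normal((50, 3))

# Each target is fitted separately, so the fits can run in parallel.
models = Parallel(n_jobs=2)(
    delayed(fit_one_target)(X, y[:, j]) for j in range(y.shape[1])
)

# Stack per-target results into the usual (n_targets, n_features) layout.
coef = np.vstack([m.coef_ for m in models])
alphas = [m.alpha_ for m in models]
print(coef.shape, alphas)
```

Unlike the multi-task estimator, each target here gets its own alpha and its own sparsity pattern.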