 

Elastic net regression or lasso regression with weighted samples (sklearn)

Scikit-learn allows sample weights to be passed to linear, logistic, and ridge regression (among others), but not to elastic net or lasso regression. By sample weights, I mean that each element of the input to fit on (and its corresponding output) has a varying importance, and should affect the estimated coefficients in proportion to its weight.
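
A minimal sketch of the asymmetry described above (the data is made up; the API state reflects scikit-learn as of the question's date):

    import numpy as np
    from sklearn.linear_model import Ridge, ElasticNet

    rng = np.random.RandomState(0)
    X, y = rng.randn(100, 5), rng.randn(100)
    w = rng.uniform(0.1, 10.0, size=100)   # per-sample importance

    Ridge(alpha=1.0).fit(X, y, sample_weight=w)        # works
    # ElasticNet.fit() had no sample_weight parameter at the time, so the
    # call below raised a TypeError on those versions:
    # ElasticNet(alpha=1.0).fit(X, y, sample_weight=w)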

Is there a way I can manipulate my data before passing it to ElasticNet.fit() to incorporate my sample weights?

If not, is there a fundamental reason it is not possible?

Thanks!

asked Oct 03 '17 by Albeit


People also ask

Is elastic net better than lasso?

Elastic net combines the L1 penalty of lasso with the L2 penalty of ridge, so it can select variables like lasso while handling groups of correlated predictors more gracefully than lasso alone. This makes it better at handling collinearity than either ridge or lasso regression on its own.

Why is lasso regression better than linear regression?

Lasso is a modification of linear regression in which the model is penalized by the sum of the absolute values of the coefficients. The coefficients are therefore shrunk toward zero, and many end up exactly zero, which performs feature selection automatically.
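
A quick illustration of that shrinkage-to-zero behavior (the data and alpha=0.1 here are arbitrary choices):

    import numpy as np
    from sklearn.linear_model import Lasso, LinearRegression

    rng = np.random.RandomState(0)
    X = rng.randn(200, 10)
    y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.randn(200)  # only 2 of 10 features matter

    print(LinearRegression().fit(X, y).coef_)  # irrelevant features: small, but nonzero
    print(Lasso(alpha=0.1).fit(X, y).coef_)    # irrelevant features: exactly 0.0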

Is lasso better than regression?

Lasso regression is a good choice when there are many features and you need to reduce their number in order to make the model simpler and more interpretable.

Why would you want to use lasso instead of ridge regression?

Lasso tends to do well if there are a small number of significant parameters and the others are close to zero (ergo: when only a few predictors actually influence the response). Ridge works well if there are many large parameters of about the same value (ergo: when most predictors impact the response).


1 Answer

You can read some discussion about this in sklearn's issue-tracker.

It basically boils down to:

  • it is not that hard to do, theory-wise (see the rescaling sketch below)
  • it is a pain to keep all the basic sklearn APIs intact and to support all possible cases (dense vs. sparse input)
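
On the theory side, one trick worth knowing (my addition, not something stated in the linked issue): for a squared-error data term you can fold the weights into the data itself by scaling each row of X and y by sqrt(w_i), because sum_i w_i*(y_i - x_i@b)**2 == sum_i (sqrt(w_i)*y_i - sqrt(w_i)*x_i@b)**2, while the penalties involve only the coefficients and are untouched. A minimal sketch (the function name is hypothetical; weights are normalized to mean 1 so ElasticNet's 1/(2*n_samples) loss scaling keeps alpha's meaning):

    import numpy as np
    from sklearn.linear_model import ElasticNet

    def fit_weighted_enet(X, y, sample_weight, alpha=1.0, l1_ratio=0.5):
        # Scaling the rows by sqrt(w) folds the weights into the loss term;
        # the L1/L2 penalties depend only on the coefficients and are unchanged.
        w = np.asarray(sample_weight, dtype=float)
        w = w * (len(w) / w.sum())           # normalize to mean 1: keeps alpha comparable
        sw = np.sqrt(w)
        # fit_intercept=False: an intercept must not be row-scaled, so proper
        # intercept handling would need weighted centering of X and y first.
        model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, fit_intercept=False)
        return model.fit(sw[:, None] * X, sw * y)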

As you can see in that thread and the linked one about the adaptive lasso, there is not much activity there (probably because few people need it and the related paper is not well known; but that's only a guess).

Depending on your exact task (size? sparsity?), you could quite easily build your own optimizer based on scipy.optimize that supports this kind of sample weighting (it will be a bit slower, but robust and precise)!
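
For example, here is one way such an optimizer could look (a sketch under my own assumptions, not the answerer's code). The coefficient vector is split into positive and negative parts, b = p - q with p, q >= 0, so the non-smooth L1 term becomes the smooth linear term sum(p + q), which L-BFGS-B can handle with simple box bounds:

    import numpy as np
    from scipy.optimize import minimize

    def weighted_enet_scipy(X, y, sample_weight, alpha=1.0, l1_ratio=0.5):
        n, d = X.shape
        w = np.asarray(sample_weight, dtype=float)
        w = w * (n / w.sum())                # mean-1 weights, matching sklearn's 1/(2n) scaling

        # With b = p - q and p, q >= 0, ||b||_1 = sum(p + q) is smooth.
        def objective(z):
            p, q = z[:d], z[d:]
            b = p - q
            r = y - X @ b
            return ((w * r**2).sum() / (2 * n)
                    + alpha * l1_ratio * (p + q).sum()
                    + 0.5 * alpha * (1 - l1_ratio) * (b @ b))

        def grad(z):
            p, q = z[:d], z[d:]
            b = p - q
            r = y - X @ b
            g_b = -(X.T @ (w * r)) / n + alpha * (1 - l1_ratio) * b
            return np.concatenate([g_b + alpha * l1_ratio,
                                   -g_b + alpha * l1_ratio])

        res = minimize(objective, np.zeros(2 * d), jac=grad,
                       method="L-BFGS-B", bounds=[(0, None)] * (2 * d))
        return res.x[:d] - res.x[d:]

With uniform weights this should reproduce ElasticNet(alpha=alpha, l1_ratio=l1_ratio, fit_intercept=False).coef_ up to solver tolerance, which makes a convenient sanity check.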

answered by sascha