Logistic Regression with sklearn

Not sure if this is a great place for this question, but I was told CrossValidated was not. So, all these questions refer to sklearn, but if you have insights into logistic regression in general, I'd love to hear them as well.

1) Does the data have to be standardized (mean 0, stdev 1)?
2) In sklearn, how do I specify what kind of regularization I want (L1 vs L2)? Note that this is different from penalty; penalty refers to classification error, not penalty on coefficients.
3) How can I use it to also do variable selection? I.e., analogously to lasso for linear regression.
4) When using regularization, how do I optimize for C, the regularization strength? Is there something built-in, or do I have to take care of this myself?

Probably an example would be most helpful, but I'd appreciate any insights on any of these questions.

This has been my starting point: http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

Thank you very much in advance!

Baron Yugovich asked Sep 22 '15 18:09


1 Answer

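1) For logistic regression itself, no. You are not computing distances between instances. Note, however, that once you regularize, the penalty acts on the coefficients, so the scale of each feature affects how strongly it is penalized; standardizing is usually a good idea in that case.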

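2) Use the penalty parameter: penalty='l1' or penalty='l2'. Despite the name, this is the penalty on the coefficients (the regularization term), not a penalty on classification error. See the LogisticRegression page. The L2 penalty is the default.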

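3) scikit-learn provides various explicit feature selection techniques, e.g. SelectKBest with a chi2 ranking function. An L1 penalty also drives some coefficients to exactly zero, which is the closest analogue to lasso for linear regression.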

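4) You will want to do a grid search over C (e.g. with GridSearchCV) and keep the value with the best cross-validated score. LogisticRegressionCV is a built-in alternative that searches over C for you.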
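As a rough illustration of points 2 and 4, here is a minimal sketch that searches over both the penalty type and C. The synthetic dataset, the grid values, and the choice of solver='liblinear' (picked because it accepts both 'l1' and 'l2') are my assumptions, not part of the question; depending on your scikit-learn version, the way the penalty is specified may differ or emit a deprecation warning.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic data purely for illustration (an assumption, not from the question).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)

# liblinear is used here because it supports both 'l1' and 'l2' penalties.
model = LogisticRegression(solver="liblinear")

# Hypothetical grid: smaller C means stronger regularization.
param_grid = {
    "penalty": ["l1", "l2"],
    "C": [0.01, 0.1, 1.0, 10.0],
}

# 5-fold cross-validated search over every (penalty, C) combination.
search = GridSearchCV(model, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
print(best_params, best_score)
```

After fitting, search.best_estimator_ is a model refit on all the data with the winning combination, so it can be used directly for prediction.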

For more detail on all these questions, I suggest going through some of the Examples, e.g. this one and this one.
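For point 3, a hedged sketch of two options: explicit selection with SelectKBest and chi2, and implicit selection through an L1 penalty. The dataset, k=5, and C=0.1 are arbitrary illustrative choices on my part. Since chi2 only accepts non-negative features, the sketch rescales the data to [0, 1] first, which is another assumption about your data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler

# Synthetic data purely for illustration (an assumption, not from the question).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)

# Option A: explicit selection. chi2 requires non-negative values,
# so rescale every feature to the [0, 1] range before scoring.
X_nonneg = MinMaxScaler().fit_transform(X)
selector = SelectKBest(score_func=chi2, k=5)
X_selected = selector.fit_transform(X_nonneg, y)
selected_idx = selector.get_support(indices=True)
print("SelectKBest kept features:", selected_idx)

# Option B: implicit selection. An L1 penalty can shrink some
# coefficients to exactly zero; the surviving features are the selection.
l1_model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
l1_model.fit(X, y)
nonzero_idx = np.flatnonzero(l1_model.coef_[0])
print("L1 model kept features:", nonzero_idx)
```

Lowering C in option B strengthens the penalty and typically zeroes out more coefficients, so the number of kept features can be tuned with the same kind of grid search used for C.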

Ansari answered Oct 01 '22 18:10