Finding coefficients for logistic regression in python

I'm working on a classification problem and need the coefficients of the logistic regression equation. I can find the coefficients in R, but I need to submit the project in Python, and I couldn't find how to obtain the fitted coefficients of a logistic regression there. How can I get the coefficient values in Python?

MonkeyDLuffy asked Sep 13 '19 13:09


3 Answers

Use sklearn.linear_model.LogisticRegression. After fitting, the coefficients are available in the coef_ attribute and the intercept in intercept_. See this example:

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

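X, y = load_iris(return_X_y=True)
# max_iter is raised from the default of 100 so the solver converges on this data
clf = LogisticRegression(random_state=0, max_iter=1000).fit(X, y)

# coef_ holds the feature coefficients, intercept_ the intercept term(s)
print(clf.coef_, clf.intercept_)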
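To make the printed arrays easier to read, here is a minimal sketch that pairs each coefficient with its feature name. It assumes the same iris data as above; the loop and variable names are illustrative only. Note that scikit-learn applies L2 regularization by default (controlled by C), so the values will generally not match an unpenalized fit such as the one R's glm produces.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X, y = iris.data, iris.target

# max_iter is raised from the default so the solver converges on this data
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Iris has 3 classes and 4 features, so coef_ has shape (3, 4)
# and intercept_ has shape (3,): one row of coefficients per class.
print(clf.coef_.shape, clf.intercept_.shape)

# Pair each coefficient with its feature name, class by class
for class_name, row, b0 in zip(iris.target_names, clf.coef_, clf.intercept_):
    print(class_name, "intercept:", round(b0, 3))
    for feature, coef in zip(iris.feature_names, row):
        print(f"  {feature}: {coef:.3f}")
```

For a binary problem, coef_ has a single row, so clf.coef_[0] gives the coefficient vector directly.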
Massifox answered Oct 20 '22 11:10


The statsmodels library gives you a breakdown of the coefficient estimates together with their p-values, which you can use to judge their significance.

For example, with features x1 and a response y1:

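import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

x1_train, x1_test, y1_train, y1_test = train_test_split(x1, y1, random_state=0)

logreg = LogisticRegression().fit(x1_train, y1_train)

print("Training set score: {:.3f}".format(logreg.score(x1_train, y1_train)))
print("Test set score: {:.3f}".format(logreg.score(x1_test, y1_test)))

# Logit does not add an intercept on its own; add_constant supplies the
# "const" column (it is skipped if x1 already contains one)
logit_model = sm.Logit(y1, sm.add_constant(x1))
result = logit_model.fit()
print(result.summary())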

Example results:

Optimization terminated successfully.
         Current function value: 0.596755
         Iterations 7
                           Logit Regression Results                           
==============================================================================
Dep. Variable:             IsCanceled   No. Observations:                20000
Model:                          Logit   Df Residuals:                    19996
Method:                           MLE   Df Model:                            3
Date:                Sat, 17 Aug 2019   Pseudo R-squ.:                  0.1391
Time:                        23:58:55   Log-Likelihood:                -11935.
converged:                       True   LL-Null:                       -13863.
                                        LLR p-value:                     0.000
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const         -2.1417      0.050    -43.216      0.000      -2.239      -2.045
x1             0.0055      0.000     32.013      0.000       0.005       0.006
x2             0.0236      0.001     36.465      0.000       0.022       0.025
x3             2.1137      0.104     20.400      0.000       1.911       2.317
==============================================================================
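As a rough sketch of how the coef column is used, the fitted values plug into the logistic function to give a predicted probability, and exp(coef) gives an odds ratio. The coefficients below are copied from the summary above; the example observation and the helper name predict_proba are made up for illustration.

```python
import math

# Coefficients copied from the example summary above
intercept = -2.1417
coefs = {"x1": 0.0055, "x2": 0.0236, "x3": 2.1137}


def predict_proba(row):
    """Return P(y = 1) for a dict of feature values."""
    z = intercept + sum(coefs[name] * row[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical observation, chosen only for illustration
example = {"x1": 100.0, "x2": 10.0, "x3": 0.0}
print(f"P(y=1) = {predict_proba(example):.3f}")

# exp(coef) is the multiplicative change in the odds per one-unit increase
odds_ratios = {name: math.exp(b) for name, b in coefs.items()}
print(odds_ratios)
```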
Michael Grogan answered Oct 20 '22 12:10


Have a look at the statsmodels library's Logit model.

You would use it like this:

from statsmodels.discrete.discrete_model import Logit
from statsmodels.tools import add_constant

x = [...]  # Observations
y = [...]  # Response variable

x = add_constant(x)
print(Logit(y, x).fit().summary())
Jan Morawiec answered Oct 20 '22 13:10