
Python : How to interpret the result of logistic regression by sm.Logit

When I run a logistic regression by sm.Logit (in the statsmodel library), part of the result is like this:

Pseudo R-squ.: 0.4335

Log-Likelihood: -291.08

LL-Null: -513.87

LLR p-value: 2.978e-96

How should I interpret the significance of the model, or its explanatory power? Which indicator should I use? I have searched online and there isn't much information about the pseudo R-squared and the LLR p-value, so I'm not sure how to tell whether my model is good.

asked Oct 12 '17 by R.Yan


People also ask

What does sm.Logit do?

Statsmodels provides a Logit() function for performing logistic regression. The Logit() function accepts y and X as parameters and returns the Logit object. The model is then fitted to the data.
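
For example, a minimal sketch with synthetic data (the data and variable names here are illustrative, not from the question):

    import numpy as np
    import statsmodels.api as sm

    # synthetic data for illustration: 200 observations, 2 predictors
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200) > 0).astype(int)

    X = sm.add_constant(X)           # add the intercept column
    result = sm.Logit(y, X).fit()    # Logit(endog, exog), fitted by maximum likelihood
    print(result.summary())          # reports Pseudo R-squ., Log-Likelihood, LL-Null, LLR p-value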

How do you interpret pseudo R Squared in logistic regression?

LL-based pseudo-R2 measures draw comparisons between the LL of the estimated model and the LL of the null model. The null model contains no parameters but the intercept. Pseudo-R2s can then be interpreted as a measure of improvement over the null model in terms of LL and thus give an indication of goodness of fit.


1 Answer

From Hands-On Machine Learning for Algorithmic Trading:

  • Log-Likelihood: this is the maximized value of the log-likelihood function.
  • LL-Null: this is the result of the maximized log-likelihood function when only an intercept is included. It forms the basis for the pseudo-R^2 statistic and the Log-Likelihood Ratio (LLR) test (see below).
  • pseudo-R^2: this is a substitute for the familiar R^2 available under least squares. It is computed based on the ratio of the maximized log-likelihood function for the null model m0 and the full model m1 as follows:

pseudo-R^2 = 1 - LL(m1) / LL(m0)

The values vary from 0 (when the model does not improve the likelihood) to 1 (where the model fits perfectly and the log-likelihood is maximized at 0). Consequently, higher values indicate a better fit.
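
As a quick illustration, the pseudo-R^2 can be recomputed by hand from a fitted statsmodels result (assuming result comes from a sm.Logit(y, X).fit() call as in the sketch above; llf, llnull and prsquared are attributes of the results object):

    ll_full = result.llf        # maximized log-likelihood of the fitted model
    ll_null = result.llnull     # log-likelihood of the intercept-only model
    pseudo_r2 = 1 - ll_full / ll_null
    print(pseudo_r2, result.prsquared)   # the manual value should match the reported one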

  • LLR: The LLR test compares a more restricted model (here, the intercept-only null model) against the full model and is computed as:

LLR = -2 * (LL(m0) - LL(m1))

The null hypothesis is that the restricted model performs just as well; a low p-value suggests that we can reject this hypothesis and prefer the full model over the null model. This is similar to the F-test for linear regression (where we can also use the LLR test when we estimate the model using MLE).
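
A small sketch of the same computation, again assuming the fitted result from above (scipy is used only for the chi-squared tail probability; df_model is the number of slope coefficients):

    from scipy import stats

    # likelihood-ratio statistic: twice the log-likelihood gain over the null model
    llr = -2 * (result.llnull - result.llf)
    # p-value from a chi-squared distribution with df = number of slope coefficients
    p_value = stats.chi2.sf(llr, df=result.df_model)
    print(llr, p_value)         # should agree with result.llr and result.llr_pvalue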

  • z-statistic: plays the same role as the t-statistic in the linear regression output and is likewise computed as the ratio of the coefficient estimate and its standard error.

  • p-values: these indicate the probability of observing the test statistic assuming the null hypothesis H0 that the population coefficient is zero.
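
For these per-coefficient statistics, the corresponding quantities can be read directly off the fitted result (again assuming the result object from the earlier sketch):

    print(result.params / result.bse)   # coefficient estimates divided by their standard errors
    print(result.tvalues)               # the z-statistics reported by statsmodels (same ratio)
    print(result.pvalues)               # two-sided p-values under H0: coefficient = 0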

As you can see (and the way I understand it), many of these metrics are counterparts to those of the linear regression case. Furthermore, as Rose already pointed out, I would recommend checking the statsmodels documentation.

answered Oct 01 '22 by Arturo Moncada-Torres