Confidence and prediction intervals with StatsModels

I fit this linear regression with StatsModels:

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.sandbox.regression.predstd import wls_prediction_std

    n = 100

    x = np.linspace(0, 10, n)
    e = np.random.normal(size=n)
    y = 1 + 0.5*x + 2*e
    X = sm.add_constant(x)

    re = sm.OLS(y, X).fit()
    print(re.summary())

    prstd, iv_l, iv_u = wls_prediction_std(re)

My question is: are iv_l and iv_u the lower and upper limits of the confidence interval or of the prediction interval?

How do I get the other one?

I need both the confidence and the prediction intervals at every point so that I can plot them.

asked Jul 09 '13 22:07

F.N.B

2 Answers

For test data, you can try the following.

    predictions = result.get_prediction(out_of_sample_df)
    predictions.summary_frame(alpha=0.05)

I found the summary_frame() method buried here and you can find the get_prediction() method here. You can change the significance level of the confidence interval and prediction interval by modifying the "alpha" parameter.

I am posting this here because this is the first post that comes up when searching for confidence and prediction intervals, even though this answer deals with test data instead.

Here's a function to take a model, new data, and an arbitrary quantile, using this approach:

    def ols_quantile(m, X, q):
        # m: OLS model.
        # X: X matrix.
        # q: Quantile.
        #
        # Set alpha based on q.
        a = q * 2
        if q > 0.5:
            a = 2 * (1 - q)
        predictions = m.get_prediction(X)
        frame = predictions.summary_frame(alpha=a)
        if q > 0.5:
            return frame.obs_ci_upper
        return frame.obs_ci_lower

answered Sep 20 '22 13:09

Julius

Update: see the other answer, which is more recent. Some of the model and results classes now have a get_prediction method that provides additional information, including prediction intervals and/or confidence intervals for the predicted mean.

Old answer:

iv_l and iv_u give you the limits of the prediction interval for each point.

The prediction interval is the confidence interval for a new observation, so it includes the estimated error variance in addition to the uncertainty of the fitted mean.

I think the confidence interval for the mean prediction is not yet available in statsmodels. (Actually, the confidence interval for the fitted values is hiding inside the summary_table of outliers_influence, but I need to verify this.)

Proper prediction methods for statsmodels are on the TODO list.

Addition

Confidence intervals are there for OLS but the access is a bit clumsy.

Add this after running your script:

    import matplotlib.pyplot as plt
    from statsmodels.stats.outliers_influence import summary_table

    st, data, ss2 = summary_table(re, alpha=0.05)

    fittedvalues = data[:, 2]
    predict_mean_se = data[:, 3]
    # Confidence interval for the mean prediction
    predict_mean_ci_low, predict_mean_ci_upp = data[:, 4:6].T
    # Prediction interval for an observation
    predict_ci_low, predict_ci_upp = data[:, 6:8].T

    # Check we got the right things (all three should be close to zero)
    print(np.max(np.abs(re.fittedvalues - fittedvalues)))
    print(np.max(np.abs(iv_l - predict_ci_low)))
    print(np.max(np.abs(iv_u - predict_ci_upp)))

    plt.plot(x, y, 'o')
    plt.plot(x, fittedvalues, '-', lw=2)
    plt.plot(x, predict_ci_low, 'r--', lw=2)
    plt.plot(x, predict_ci_upp, 'r--', lw=2)
    plt.plot(x, predict_mean_ci_low, 'r--', lw=2)
    plt.plot(x, predict_mean_ci_upp, 'r--', lw=2)
    plt.show()

This should give the same results as SAS, http://jpktd.blogspot.ca/2012/01/nice-thing-about-seeing-zeros.html

answered Sep 19 '22 13:09

Josef


Tags: python, statistics, statsmodels
