 

Naming explanatory variables in regression output

Each of my variables is a separate list.

I am using a method found in another thread here:

import numpy as np
import statsmodels.api as sm

y = [1,2,3,4,3,4,5,4,5,5,4,5,4,5,4,5,6,5,4,5,4,3,4]

x = [
     [4,2,3,4,5,4,5,6,7,4,8,9,8,8,6,6,5,5,5,5,5,5,5],
     [4,1,2,3,4,5,6,7,5,8,7,8,7,8,7,8,7,7,7,7,7,6,5],
     [4,1,2,5,6,7,8,9,7,8,7,8,7,7,7,7,7,7,6,6,4,4,4]
     ]

def reg_m(y, x):
    # Build the design matrix: stack each predictor list as a column,
    # ending with a column of ones for the intercept.
    ones = np.ones(len(x[0]))
    X = sm.add_constant(np.column_stack((x[0], ones)))
    for ele in x[1:]:
        # add_constant is a no-op here because a constant column already exists;
        # column_stack just prepends the next predictor.
        X = sm.add_constant(np.column_stack((ele, X)))
    results = sm.OLS(y, X).fit()
    return results

My only problem is that in the regression output the explanatory variables are labelled x1, x2, x3, etc. Is it possible to change these to more meaningful names?

Thanks

asked Apr 12 '16 by aspiringcoderzzz

People also ask

How do you name variables in regression?

The outcome variable is also called the response or dependent variable, and the risk factors and confounders are called the predictors, or explanatory or independent variables. In regression analysis, the dependent variable is denoted "Y" and the independent variables are denoted by "X".

How do you know which variable is the explanatory variable?

An explanatory variable is what you manipulate or observe changes in (e.g., caffeine dose), while a response variable is what changes as a result (e.g., reaction times). The words “explanatory variable” and “response variable” are often interchangeable with other terms used in research.

Which variable is the outcome for changes in the explanatory variable?

Response Variable is the result of the experiment where the explanatory variable is manipulated. It is a factor whose variation is explained by the other factors. Response Variable is often referred to as the Dependent Variable or the Outcome Variable.

Is a way to find out if explanatory variables in a model are significant?

To test the explanatory power of the whole set of explanatory variables, as compared to just using the overall mean of the outcome variable, use the F-statistic and the p-value printed under “ANOVA” by SPSS or Excel. If this p-value is less than 0.05, you can reject the null hypothesis (which is that all of the explanatory variables have coefficients of zero).


1 Answer

Searching through the source, it appears the summary() method does support using your own names for explanatory variables. So:

results = sm.OLS(y, X).fit()
# One name per column of X, in column order; the last column here is the constant.
print(results.summary(xname=['Fred', 'Mary', 'Ethel', 'Bob']))

gives us:

                                OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.535
Model:                            OLS   Adj. R-squared:                  0.461
Method:                 Least Squares   F-statistic:                     7.281
Date:                Mon, 11 Apr 2016   Prob (F-statistic):            0.00191
Time:                        22:22:47   Log-Likelihood:                -26.025
No. Observations:                  23   AIC:                             60.05
Df Residuals:                      19   BIC:                             64.59
Df Model:                           3
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
Fred           0.2424      0.139      1.739      0.098        -0.049     0.534
Mary           0.2360      0.149      1.587      0.129        -0.075     0.547
Ethel         -0.0618      0.145     -0.427      0.674        -0.365     0.241
Bob            1.5704      0.633      2.481      0.023         0.245     2.895
==============================================================================
Omnibus:                        6.904   Durbin-Watson:                   1.905
Prob(Omnibus):                  0.032   Jarque-Bera (JB):                4.708
Skew:                          -0.849   Prob(JB):                       0.0950
Kurtosis:                       4.426   Cond. No.                         38.6
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
answered Sep 21 '22 by Gerrat