 

Naming explanatory variables in regression output

Each of my variables is a separate list.

I am using a method found in another thread here:

import numpy as np
import statsmodels.api as sm

y = [1,2,3,4,3,4,5,4,5,5,4,5,4,5,4,5,6,5,4,5,4,3,4]

x = [
     [4,2,3,4,5,4,5,6,7,4,8,9,8,8,6,6,5,5,5,5,5,5,5],
     [4,1,2,3,4,5,6,7,5,8,7,8,7,8,7,8,7,7,7,7,7,6,5],
     [4,1,2,5,6,7,8,9,7,8,7,8,7,7,7,7,7,7,6,6,4,4,4]
     ]

def reg_m(y, x):
    # Build the design matrix: stack each predictor list as a column,
    # ending with a column of ones for the intercept.
    ones = np.ones(len(x[0]))
    X = sm.add_constant(np.column_stack((x[0], ones)))
    for ele in x[1:]:
        # add_constant is a no-op here because a constant column already exists;
        # column_stack just prepends the next predictor.
        X = sm.add_constant(np.column_stack((ele, X)))
    results = sm.OLS(y, X).fit()
    return results

My only problem is that in the regression output the explanatory variables are labelled x1, x2, x3, etc. Is it possible to change these to more meaningful names?

Thanks

asked Apr 12 '16 by aspiringcoderzzz

People also ask

How do you name variables in regression?

The outcome variable is also called the response or dependent variable, and the risk factors and confounders are called the predictors, or explanatory or independent variables. In regression analysis, the dependent variable is denoted "Y" and the independent variables are denoted by "X".

How do you know which variable is the explanatory variable?

An explanatory variable is what you manipulate or observe changes in (e.g., caffeine dose), while a response variable is what changes as a result (e.g., reaction times). The words “explanatory variable” and “response variable” are often interchangeable with other terms used in research.

Which variable is the outcome for changes in the explanatory variable?

Response Variable is the result of the experiment where the explanatory variable is manipulated. It is a factor whose variation is explained by the other factors. Response Variable is often referred to as the Dependent Variable or the Outcome Variable.

Is a way to find out if explanatory variables in a model are significant?

To test the explanatory power of the whole set of explanatory variables, as compared to just using the overall mean of the outcome variable, use the F-statistic and the p-value printed under “ANOVA” by SPSS or Excel. If this p-value is less than 0.05, you can reject the null hypothesis (which is that all of the explanatory variables have coefficients of zero).


1 Answer

Searching through the source, it appears the summary() method does support using your own names for explanatory variables. So:

results = sm.OLS(y, X).fit()
# One name per column of X, in column order; the last column here is the constant.
print(results.summary(xname=['Fred', 'Mary', 'Ethel', 'Bob']))

gives us:

                                OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.535
Model:                            OLS   Adj. R-squared:                  0.461
Method:                 Least Squares   F-statistic:                     7.281
Date:                Mon, 11 Apr 2016   Prob (F-statistic):            0.00191
Time:                        22:22:47   Log-Likelihood:                -26.025
No. Observations:                  23   AIC:                             60.05
Df Residuals:                      19   BIC:                             64.59
Df Model:                           3
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
Fred           0.2424      0.139      1.739      0.098        -0.049     0.534
Mary           0.2360      0.149      1.587      0.129        -0.075     0.547
Ethel         -0.0618      0.145     -0.427      0.674        -0.365     0.241
Bob            1.5704      0.633      2.481      0.023         0.245     2.895
==============================================================================
Omnibus:                        6.904   Durbin-Watson:                   1.905
Prob(Omnibus):                  0.032   Jarque-Bera (JB):                4.708
Skew:                          -0.849   Prob(JB):                       0.0950
Kurtosis:                       4.426   Cond. No.                         38.6
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
answered Sep 21 '22 by Gerrat