
Compute linear regression standardized coefficient (beta) with Python

I would like to compute the beta or standardized coefficient of a linear regression model using standard tools in Python (numpy, pandas, scipy.stats, etc.).

A friend of mine told me that this is done in R with the following command:

lm(scale(y) ~ scale(x))

Currently, I am computing it in Python like this:

from scipy.stats import linregress
from scipy.stats.mstats import zscore

(beta_coeff, intercept, rvalue, pvalue, stderr) = linregress(zscore(x), zscore(y))
print('The Beta Coeff is: %f' % beta_coeff)

Is there a more straightforward function to compute this figure in Python?

David asked Nov 25 '15 10:11



People also ask

How do you find the standardized beta coefficient in Python?

You just need to standardize your original DataFrame using a z distribution (i.e., z-score) first and then perform a linear regression. Now, the coef will show you the standardized (beta) coefficients so that you can compare their influence on your dependent variable. Notes: Please keep in mind that you need .

How do you calculate standardized beta in regression?

Betas are calculated by first standardizing each variable: subtract its mean and divide by its standard deviation. The standardized variables then have a mean of zero and a standard deviation of 1, and the coefficients of a regression fitted on them are the betas.
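As a minimal sketch of that standardization step, assuming NumPy is available (the `zscore` helper and the data below are made up for illustration):

```python
import numpy as np

def zscore(values, ddof=0):
    """Center on the mean and scale by the standard deviation."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std(ddof=ddof)

# Made-up data: mean 5, population standard deviation 2.
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
z = zscore(x)

print(z.mean())  # approximately 0
print(z.std())   # approximately 1
```

Note that scipy's zscore defaults to ddof=0 (population standard deviation), while R's scale() divides by the n - 1 standard deviation. The standardized slope is the same either way, as long as x and y are scaled with the same convention.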

How do you calculate the standardized beta coefficient?

The standardized coefficient is found by multiplying the unstandardized coefficient by the ratio of the standard deviations of the independent variable and dependent variable.

How are beta coefficients calculated in linear regression?

Here x and y refer to the predictor and response variables. To get a beta coefficient from an ordinary regression coefficient, take the standard deviation of the predictor, divide it by the standard deviation of the response, and multiply the result by the regression coefficient for that predictor.
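The conversion described above can be checked numerically. The following is a sketch on synthetic data (the seed, sample size, and true slope of 3.0 are arbitrary assumptions) comparing the rescaled slope with the slope of a regression on z-scored variables. For a single predictor, both should also match the Pearson correlation.

```python
import numpy as np

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 3.0 * x + rng.normal(size=200)

# Unstandardized slope b from an ordinary least-squares line fit.
b = np.polyfit(x, y, 1)[0]

# Route 1: rescale b by the ratio of standard deviations, sd(x) / sd(y).
beta_from_ratio = b * x.std() / y.std()

# Route 2: z-score both variables, then fit the line.
zx = (x - x.mean()) / x.std()
zy = (y - y.mean()) / y.std()
beta_from_z = np.polyfit(zx, zy, 1)[0]

# With one predictor, beta equals the Pearson correlation coefficient.
r = np.corrcoef(x, y)[0, 1]

print(beta_from_ratio, beta_from_z, r)
```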


1 Answer

Python is a general-purpose language, but R was designed specifically for statistics. It's almost always going to take a few more lines of code to achieve the same (statistical) goal in Python, purely because R comes ready to fit regression models (using lm) as soon as you boot it up.

The short answer to your question is no: your Python code is already pretty straightforward.

That said, I think a closer equivalent to your R code would be

import statsmodels.api as sm
from scipy.stats.mstats import zscore

print(sm.OLS(zscore(y), zscore(x)).fit().summary())
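The OLS call above fits without an intercept term, which is harmless here because z-scored variables have mean zero, so the fitted intercept would be zero anyway. A NumPy-only sketch of that point, using two made-up predictors and np.linalg.lstsq in place of statsmodels (all names and data are illustrative assumptions):

```python
import numpy as np

# Synthetic data with two predictors, for illustration only.
rng = np.random.default_rng(1)
n = 300
X = rng.normal(size=(n, 2))
y = 1.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=n)

# Standardize each predictor column and the response.
Xz = (X - X.mean(axis=0)) / X.std(axis=0)
yz = (y - y.mean()) / y.std()

# Fit without a constant column, as in the OLS call above.
betas_no_const, *_ = np.linalg.lstsq(Xz, yz, rcond=None)

# Fit again with an explicit constant column for comparison.
design = np.column_stack([np.ones(n), Xz])
coef_with_const, *_ = np.linalg.lstsq(design, yz, rcond=None)
intercept = coef_with_const[0]
betas_with_const = coef_with_const[1:]

print(intercept)         # numerically zero
print(betas_no_const)    # standardized coefficients
print(betas_with_const)  # same values as above
```

With unstandardized data the two fits would generally differ, and a constant column would then be needed.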
Eoin answered Oct 22 '22 13:10
