Compute linear regression standardized coefficient (beta) with Python

Tags:

python

numpy

statistics

scipy

linear-regression

I would like to compute the beta or standardized coefficient of a linear regression model using standard tools in Python (numpy, pandas, scipy.stats, etc.).

A friend of mine told me that this is done in R with the following command:

lm(scale(y) ~ scale(x))

Currently, I am computing it in Python like this:

from scipy.stats import linregress
from scipy.stats.mstats import zscore

(beta_coeff, intercept, rvalue, pvalue, stderr) = linregress(zscore(x), zscore(y))
print('The Beta Coeff is: %f' % beta_coeff)

Is there a more straightforward function to compute this figure in Python?

360

asked Nov 25 '15 10:11

David

1 Answer

Python is a general-purpose language, whereas R was designed specifically for statistics. Achieving the same statistical goal in Python will almost always take a few more lines of code, simply because R is ready to fit regression models (using lm) as soon as you start it.

The short answer to your question is no: your Python code is already pretty straightforward.

That said, I think a closer equivalent to your R code would be

import statsmodels.api as sm
from scipy.stats.mstats import zscore

# No constant term is added here: z-scored data has mean zero, so the intercept is zero anyway.
print(sm.OLS(zscore(y), zscore(x)).fit().summary())
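
To make explicit what both snippets compute, here is a minimal sketch using only the standard library. It is an illustration, not a replacement for linregress or statsmodels: the helper names (zscores, standardized_beta, pearson_r) and the sample data are made up for this example. It z-scores both variables, takes the least-squares slope, and compares it with the Pearson correlation, which should match for a single predictor.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize values to mean 0 and population standard deviation 1."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def standardized_beta(x, y):
    """Least-squares slope of z-scored y on z-scored x (hypothetical helper)."""
    zx, zy = zscores(x), zscores(y)
    # For centered data the OLS slope reduces to sum(zx * zy) / sum(zx * zx).
    return sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)


def pearson_r(x, y):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

beta = standardized_beta(x, y)
r = pearson_r(x, y)
print("standardized beta: %f" % beta)
print("Pearson r:         %f" % r)
```

Two practical consequences follow for the single-predictor case. The rvalue already returned by linregress equals the standardized coefficient, so the z-scoring step is optional there. Also, scipy's zscore divides by the population standard deviation (ddof=0) by default while R's scale divides by the sample standard deviation (n - 1), but the slope is the same either way as long as x and y are scaled with the same convention.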

105

answered Oct 22 '22 13:10

Eoin
