I'm using the scipy.stats.linregress function to do a simple linear regression on some 2D data, e.g.:
from scipy import stats
x = [5.05, 6.75, 3.21, 2.66]
y = [1.65, 26.5, -5.93, 7.96]
gradient, intercept, r_value, p_value, std_err = stats.linregress(x,y)
The documentation on the function states that std_err is the: "Standard error of the estimate".
I'm not sure what this means. This old answer says that it represents the "standard error of the gradient line" but that this "was not always the behaviour of this library".
Could I get a precise definition of what exactly this parameter represents?
As of Dec 2016, I think that it's still showing the standard error of the slope of the OLS regression line. I calculated the regression of some datasets using orthogonal distance regression as part of the scipy package, and the output's sd_beta[1] (representative of the standard error of the slope of the regression line) was very similar to the std_err as calculated by scipy.stats.linregress.
This is a standard measure in statistics. See Wikipedia for a description of how to compute it. Unfortunately, Stack Overflow does not seem to have LaTeX support, so it does not make sense to write out and explain the equations here.
Essentially, std_err is the standard error of the estimated gradient (slope). For a simple linear regression it is a single value, since there is only one slope coefficient. In simple terms, std_err tells you how precisely the gradient is determined by your data (higher values mean a less precise estimate).
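One way to check this interpretation is to compute the standard error of the slope by hand and compare it with what linregress returns. The following is a minimal sketch, assuming the textbook OLS formula (residual variance with n - 2 degrees of freedom, divided by the sum of squared deviations of x). The names residual_variance, sxx and manual_std_err are illustrative and not part of scipy.

```python
import numpy as np
from scipy import stats

# Data from the question
x = np.array([5.05, 6.75, 3.21, 2.66])
y = np.array([1.65, 26.5, -5.93, 7.96])

result = stats.linregress(x, y)

# Residuals of the fitted line
n = len(x)
residuals = y - (result.intercept + result.slope * x)

# Residual variance, using n - 2 degrees of freedom
# (two parameters are estimated: slope and intercept)
residual_variance = np.sum(residuals**2) / (n - 2)

# Sum of squared deviations of x from its mean
sxx = np.sum((x - x.mean())**2)

# Textbook standard error of the OLS slope
manual_std_err = np.sqrt(residual_variance / sxx)

print("linregress stderr:", result.stderr)
print("manual slope SE:  ", manual_std_err)
```

If the two printed values agree, std_err is the standard error of the slope rather than, say, the standard error of the residuals.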
Other useful answers on stats.stackexchange sites are here and here.