I need some help using the scipy.stats.t.interval() function
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.t.html?highlight=stats.t#scipy.stats.t
I am looking at the documentation, and it doesn't make sense to me. What are loc and scale? I'm used to Student's t intervals requiring a mean, standard deviation, degrees of freedom, and confidence level.
If you know the answer and can help, please post. Also if you could tell me how you learned it, that would be great. I've been having no luck with this documentation.
A Student's t continuous random variable. For the noncentral t distribution, see nct. As an instance of the rv_continuous class, the t object inherits from it a collection of generic methods and completes them with details specific to this particular distribution.
How to Calculate P-Values Using the t Distribution. We can use the t.cdf(x, df, loc=0, scale=1) function to find the p-value associated with a t test statistic. Suppose we perform a one-tailed (left-tailed) hypothesis test and end up with a t test statistic of -1.5 and degrees of freedom = 10.
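A minimal sketch of that p-value calculation, assuming a left-tailed test. The variable names are only illustrative, and the two-tailed value is added purely for comparison:

```python
from scipy import stats

t_stat = -1.5   # t test statistic from the example
df = 10         # degrees of freedom

# Left-tailed p-value: P(T <= t_stat) under the standard t distribution
p_left = stats.t.cdf(t_stat, df)

# Two-tailed p-value for comparison: 2 * P(T >= |t_stat|)
p_two = 2 * stats.t.sf(abs(t_stat), df)

print(p_left, p_two)
```

The left-tailed value comes out at roughly 0.082, so the null hypothesis would not be rejected at the usual 0.05 level.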
ppf: percent point function (the inverse of the cumulative distribution function). ppf returns the value x at which the cumulative distribution function (cdf) reaches a given probability. Thus, given cdf(x) for an x value, ppf returns x itself, so it operates as the inverse of cdf.
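To make the inverse relationship concrete, here is a small sketch (df = 4 and the 0.975 probability are just example choices). It also shows that a central 95% interval from t.interval is built from the same quantile:

```python
from scipy import stats

df = 4
q = 0.975

x = stats.t.ppf(q, df)        # quantile whose cumulative probability is q
q_back = stats.t.cdf(x, df)   # feeding it back into cdf recovers q

# The central 95% interval of the standard t is (-x, x) for this quantile
lo, hi = stats.t.interval(0.95, df)

print(x, q_back, lo, hi)
```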
The docs page you linked has a link to the source code, which even has a nicely formatted formula for the distribution in the comments (search for class t_gen).
loc and scale are the way all the continuous distributions in scipy.stats are parametrized: for a distribution with standardized density f(x), specifying loc and scale gives you the density f((x - loc) / scale) / scale, i.e. the distribution of loc + scale*X where X follows the standard form (line 1208 in the source linked above).
>>> import scipy.stats as stats
>>> stats.t.pdf(2, 2)
0.06804138174397717
>>> stats.t.pdf(2, 2, loc=0, scale=1)
0.06804138174397717
>>> stats.t.pdf(2+42, 2, loc=42, scale=1)
0.06804138174397717
>>> stats.t.stats(9, moments='mvsk')
(array(0.0), array(1.2857142857142858), array(0.0), array(1.2))
>>> stats.t.stats(8, loc=1, moments='mvsk')
(array(1.0), array(1.3333333333333333), array(0.0), array(1.5))
>>> stats.t.interval(0.95, 4, loc=0)
(-2.7764451051977987, 2.7764451051977987)
>>> stats.t.interval(0.95, 4, loc=3)
(0.22355489480220125, 5.7764451051977987)
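The session above only varies loc. As a quick check of what scale does as well, the following sketch (the values df=4, loc=3, scale=2, x=5 are arbitrary) compares the shifted-and-scaled results against the standard distribution:

```python
import math
from scipy import stats

df, loc, scale = 4, 3.0, 2.0
x = 5.0

# pdf with loc/scale versus the standard pdf evaluated at the standardized point
shifted = stats.t.pdf(x, df, loc=loc, scale=scale)
standard = stats.t.pdf((x - loc) / scale, df) / scale

# interval with loc/scale versus the standard interval, shifted and stretched
lo0, hi0 = stats.t.interval(0.95, df)
lo, hi = stats.t.interval(0.95, df, loc=loc, scale=scale)

print(math.isclose(shifted, standard))
print(math.isclose(lo, loc + scale * lo0), math.isclose(hi, loc + scale * hi0))
```

Both comparisons should print True: loc moves the center of the interval and scale multiplies its half-width.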
Yes, this is a little baffling at first sight :-).
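Since the previous answer is not explicit, I did some research and verified that, when you use t.interval() to get a confidence interval for a mean:
loc is the sample mean.
scale is the standard error of the mean.
The interval is then M ± t·sM,
where M is the sample mean, t is the critical t value for the chosen confidence level with df = n - 1, and sM = √(std^2/n) = std/√n is the standard error of the mean (std being the sample standard deviation). The resulting interval is the confidence interval for the population mean μ.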
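Putting that together, a minimal sketch of a 95% confidence interval for a mean might look like this. The sample values are made up for illustration, and scipy.stats.sem is used for the standard error (it defaults to ddof=1):

```python
import numpy as np
from scipy import stats

sample = np.array([4.1, 5.3, 4.8, 5.9, 5.0, 4.6])  # made-up data
n = sample.size
mean = sample.mean()
sem = stats.sem(sample)          # sample std (ddof=1) divided by sqrt(n)

# Confidence interval via loc/scale: df = n - 1, loc = mean, scale = sem
lo, hi = stats.t.interval(0.95, n - 1, loc=mean, scale=sem)

# Same interval by hand: mean +/- t_crit * sem
t_crit = stats.t.ppf(0.975, n - 1)
manual = (mean - t_crit * sem, mean + t_crit * sem)

print((lo, hi), manual)
```

The two intervals should agree, which is the M ± t·sM relationship written out in code.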