I am looking for a quick way to get the t-test confidence interval in Python for the difference between means, similar to this in R:
X1 <- rnorm(n = 10, mean = 50, sd = 10)
X2 <- rnorm(n = 200, mean = 35, sd = 14)
# the scenario is similar to my data
t_res <- t.test(X1, X2, alternative = 'two.sided', var.equal = FALSE)
t_res
Out:
        Welch Two Sample t-test

data:  X1 and X2
t = 1.6585, df = 10.036, p-value = 0.1281
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.539749 17.355816
sample estimates:
mean of x mean of y
 43.20514  35.79711
Next:
> print(c(t_res$conf.int[1], t_res$conf.int[2]))
[1] -2.539749 17.355816
I am not finding anything similar in either statsmodels or scipy, which is strange considering the importance of confidence intervals in hypothesis testing (and how much criticism the practice of reporting only p-values has received recently).
1. Create a new sample based on our dataset, with replacement and with the same number of points.
2. Calculate the mean value and store it in an array or list.
3. Repeat the process many times (e.g. 1000).
4. On the list of the mean values, calculate the 2.5th percentile and the 97.5th percentile (if you want a 95% confidence interval).
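The resampling steps above amount to a percentile bootstrap, and can be sketched roughly as follows. This is a minimal illustration rather than a library routine: the function name `bootstrap_mean_ci`, its parameters, and the example data are all assumptions made for this sketch.

```python
import numpy as np

def bootstrap_mean_ci(data, n_resamples=1000, confidence=0.95, seed=None):
    """Percentile bootstrap confidence interval for the mean of `data`.

    Hypothetical helper written for illustration only.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    means = np.empty(n_resamples)
    for i in range(n_resamples):
        # Draw a new sample with replacement, same number of points
        resample = rng.choice(data, size=data.size, replace=True)
        # Store the mean of this resample
        means[i] = resample.mean()
    # Take the tail percentiles of the stored means (2.5 and 97.5 for 95%)
    alpha = 1.0 - confidence
    lower, upper = np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lower), float(upper)

# Example data (assumed for illustration)
sample = np.arange(10, 21)
low, high = bootstrap_mean_ci(sample, seed=0)
print(low, high)
```

For a difference between two groups, the same idea would presumably apply by resampling each group separately and storing the difference of the two resampled means in each iteration.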
The confidence interval for the difference in means provides an estimate of the absolute difference in means of the outcome variable of interest between the comparison groups. It is often of interest to make a judgment as to whether there is a statistically meaningful difference between comparison groups.
Here is how to use StatsModels' CompareMeans to calculate the confidence interval for the difference between means:
import numpy as np
import statsmodels.stats.api as sms

X1, X2 = np.arange(10, 21), np.arange(20, 26.5, .5)
cm = sms.CompareMeans(sms.DescrStatsW(X1), sms.DescrStatsW(X2))
# usevar='unequal' drops the equal-variance assumption (Welch's t-test)
print(cm.tconfint_diff(usevar='unequal'))
Output is
(-10.414599391793885, -5.5854006082061138)
and matches R:
> X1 <- seq(10,20)
> X2 <- seq(20,26,.5)
> t.test(X1, X2)

        Welch Two Sample t-test

data:  X1 and X2
t = -7.0391, df = 15.58, p-value = 3.247e-06
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -10.414599  -5.585401
sample estimates:
mean of x mean of y
       15        23
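To see where these numbers come from, the Welch interval can also be computed by hand from the standard error of the difference and the Welch-Satterthwaite degrees of freedom. The sketch below assumes SciPy is available for the t quantile; `welch_ci` is a hypothetical helper name, not a function from statsmodels or SciPy.

```python
import numpy as np
from scipy import stats

def welch_ci(x1, x2, confidence=0.95):
    """Confidence interval for mean(x1) - mean(x2) without assuming equal variances.

    Hypothetical helper written for illustration only.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    # Squared standard error contributed by each group
    v1 = x1.var(ddof=1) / x1.size
    v2 = x2.var(ddof=1) / x2.size
    se = np.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (x1.size - 1) + v2 ** 2 / (x2.size - 1))
    # Two-sided critical value of Student's t distribution
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df)
    diff = x1.mean() - x2.mean()
    return float(diff - t_crit * se), float(diff + t_crit * se)

X1, X2 = np.arange(10, 21), np.arange(20, 26.5, 0.5)
lower, upper = welch_ci(X1, X2)
print(lower, upper)
```

With the same `X1` and `X2` as above, the result should agree with the interval in the R output to within rounding.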