I know that I need the mean and standard deviation to find the interval. However, what if the question is:
For a survey of 1,000 randomly chosen workers, 520 of them are female. Create a 95% confidence interval for the proportion of workers who are female based on the survey.
How do I find the mean and standard deviation for that?
Divide the numbers you found in the table by the number of population members. In this example there are 10,000 members, so the confidence interval runs from 2.202 / 10,000 = 0.00022 to 13.06 / 10,000 = 0.001306.
The tinterval command of R is a useful one for finding confidence intervals for the mean when the data are normally distributed with unknown variance. We illustrate the use of this command for the lizard tail length data. If we use the t.
The Wald interval is the most basic confidence interval for proportions. It relies heavily on the normal approximation to the binomial distribution and applies no modifications or corrections.
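For reference, the Wald interval for a sample proportion is usually written as below. The notation is the common textbook convention and is an assumption here, not something taken from the quoted text: \hat{p} = x/n is the observed proportion, n is the sample size, and z_{1-\alpha/2} is the standard normal quantile (about 1.96 for a 95% interval).

```latex
\hat{p} \;\pm\; z_{1-\alpha/2}\,\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}
```

With x = 520 and n = 1000 this works out to roughly 0.52 ± 0.031.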
A narrower confidence interval indicates a more precise estimate. For the same data, a 95% confidence interval is narrower than a 99% confidence interval; the 99% interval is more likely to contain the true value, but that extra confidence comes at the cost of precision.
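The width trade-off can be checked on the survey numbers from the question. This is a minimal sketch in base R using prop.test from stats without continuity correction; the variable names are only illustrative.

```r
# Survey figures from the question: 520 females among 1,000 workers.
x <- 520
n <- 1000

# Same data, two confidence levels (no continuity correction).
ci95 <- as.numeric(prop.test(x, n, conf.level = 0.95, correct = FALSE)$conf.int)
ci99 <- as.numeric(prop.test(x, n, conf.level = 0.99, correct = FALSE)$conf.int)

# Width of each interval (upper limit minus lower limit).
width95 <- ci95[2] - ci95[1]
width99 <- ci99[2] - ci99[1]

print(c(width95 = width95, width99 = width99))
```

The 99% interval should come out wider than the 95% one, since it has to cover the true proportion more often.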
You can also use prop.test from package stats, or binom.test:
> x <- 520; n <- 1000
> prop.test(x, n, conf.level=0.95, correct = FALSE)
1-sample proportions test without continuity correction
data: x out of n, null probability 0.5
X-squared = 1.6, df = 1, p-value = 0.2059
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.4890177 0.5508292
sample estimates:
p
0.52
You may find this article interesting: Table 1 on page 861 gives confidence intervals for a single proportion calculated using seven methods (for selected combinations of n and r). Using prop.test you can get the results found in rows 3 and 4 of the table, while binom.test returns what you see in row 5.
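Since binom.test is mentioned but not shown, here is a minimal sketch of the call for the survey numbers from the question. It returns the exact (Clopper-Pearson) interval; the names res and exact_ci are only illustrative.

```r
# Exact (Clopper-Pearson) interval for 520 successes in 1,000 trials.
res <- binom.test(x = 520, n = 1000, conf.level = 0.95)

# conf.int carries a conf.level attribute; as.numeric() drops it.
exact_ci <- as.numeric(res$conf.int)
print(exact_ci)
```

The exact interval is typically a little wider than the one prop.test reports without continuity correction.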
In this case, you have a binomial distribution, so you will be calculating a binomial proportion confidence interval.
In R, you can use binconf() from package Hmisc:
> library(Hmisc)
> binconf(x=520, n=1000)
 PointEst     Lower     Upper
     0.52 0.4890177 0.5508292
Or you can calculate it yourself:
> p <- 520/1000
> p + c(-qnorm(0.975),qnorm(0.975))*sqrt((1/1000)*p*(1-p))
[1] 0.4890351 0.5509649
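The by-hand calculation above is the Wald interval. The same approach extends to the Wilson score interval, which is what binconf() (by default) and prop.test without continuity correction report. A minimal sketch, where wilson_ci is a hypothetical helper name and not part of any package:

```r
# Wilson score interval written out by hand.
# wilson_ci is a hypothetical helper for illustration, not a library function.
wilson_ci <- function(x, n, conf.level = 0.95) {
  p <- x / n                                # observed proportion
  z <- qnorm(1 - (1 - conf.level) / 2)      # normal quantile, about 1.96 for 95%
  denom  <- 1 + z^2 / n
  centre <- (p + z^2 / (2 * n)) / denom     # midpoint, pulled slightly toward 0.5
  half   <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / denom
  c(lower = centre - half, upper = centre + half)
}

w <- wilson_ci(520, 1000)
print(w)
```

With n = 1000 the Wilson and Wald limits differ only from the fifth decimal place on; the gap grows for small samples or proportions near 0 or 1.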
Alternatively, use function propCI from the prevalence package to get the five most commonly used binomial confidence intervals:
> library(prevalence)
> propCI(x = 520, n = 1000)
    x    n    p        method level     lower     upper
1 520 1000 0.52 agresti.coull  0.95 0.4890176 0.5508293
2 520 1000 0.52         exact  0.95 0.4885149 0.5513671
3 520 1000 0.52      jeffreys  0.95 0.4890147 0.5508698
4 520 1000 0.52          wald  0.95 0.4890351 0.5509649
5 520 1000 0.52        wilson  0.95 0.4890177 0.5508292
Another package, tolerance, will calculate confidence / tolerance ranges for a ton of typical distribution functions.