Confidence interval for binomial data in R?

I know that I need the mean and standard deviation to find the interval. But what if the question is:

For a survey of 1,000 randomly chosen workers, 520 of them are female. Create a 95% confidence interval for the proportion of workers who are female based on the survey.

How do I find the mean and standard deviation for that?

Pig asked Feb 12 '14 05:02



4 Answers

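You can also use prop.test or binom.test, both from the stats package. With x successes out of n trials:

x <- 520   # number of female workers in the survey
n <- 1000  # number of workers surveyed
prop.test(x, n, conf.level=0.95, correct = FALSE)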

        1-sample proportions test without continuity correction

data:  x out of n, null probability 0.5
X-squared = 1.6, df = 1, p-value = 0.2059
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
 0.4890177 0.5508292
sample estimates:
   p 
0.52 

You may find this article interesting: Table 1 on page 861 gives confidence intervals for a single proportion calculated using seven methods (for selected combinations of n and r). prop.test gives the results found in rows 3 and 4 of the table, while binom.test returns what you see in row 5.
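For the survey in the question, both intervals can be pulled straight out of the returned test objects. This is a minimal sketch using only base R's stats package; the variable names wilson and exact are illustrative and not part of the answer above.

```r
# Survey data from the question: 520 women among 1,000 workers
x <- 520
n <- 1000

# Score (Wilson) interval: prop.test without the continuity correction
wilson <- prop.test(x, n, conf.level = 0.95, correct = FALSE)$conf.int

# Exact (Clopper-Pearson) interval from binom.test
exact <- binom.test(x, n, conf.level = 0.95)$conf.int

print(wilson)
print(exact)
```

The exact interval comes out slightly wider than the score interval, which is expected because it is built to be conservative.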

George Dontas answered Oct 05 '22 23:10


In this case, you have a binomial distribution, so you will be calculating a binomial proportion confidence interval.

In R, you can use binconf() from the Hmisc package:

> library(Hmisc)
> binconf(x=520, n=1000)
 PointEst     Lower     Upper
     0.52 0.4890177 0.5508292

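Or you can calculate it yourself using the normal approximation (the Wald interval), which is why the result differs slightly from binconf's default Wilson interval:

> p <- 520/1000
> p + c(-qnorm(0.975),qnorm(0.975))*sqrt((1/1000)*p*(1-p))
[1] 0.4890351 0.5509649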
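
The hand calculation is the Wald interval, whereas binconf() reports the Wilson score interval by default. As a rough sketch of how the two formulas differ, both can be written as small functions. The names wald_ci and wilson_ci are hypothetical helpers, not functions from any package.

```r
# Hypothetical helpers for illustration; neither is part of a package.

# Wald interval: p +/- z * sqrt(p * (1 - p) / n)
wald_ci <- function(x, n, conf.level = 0.95) {
  p <- x / n
  z <- qnorm(1 - (1 - conf.level) / 2)
  se <- sqrt(p * (1 - p) / n)
  c(lower = p - z * se, upper = p + z * se)
}

# Wilson score interval: centre and width are both adjusted by z^2 / n
wilson_ci <- function(x, n, conf.level = 0.95) {
  p <- x / n
  z <- qnorm(1 - (1 - conf.level) / 2)
  denom <- 1 + z^2 / n
  centre <- (p + z^2 / (2 * n)) / denom
  half <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / denom
  c(lower = centre - half, upper = centre + half)
}

wald_ci(520, 1000)
wilson_ci(520, 1000)
```

With n = 1000 and p close to 0.5 the two intervals agree to about four decimal places; the gap grows for small samples or proportions near 0 or 1.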

Zbynek answered Oct 05 '22 21:10


Alternatively, use the function propCI from the prevalence package to get the five most commonly used binomial confidence intervals:

> library(prevalence)
> propCI(x = 520, n = 1000)
    x    n    p        method level     lower     upper
1 520 1000 0.52 agresti.coull  0.95 0.4890176 0.5508293
2 520 1000 0.52         exact  0.95 0.4885149 0.5513671
3 520 1000 0.52      jeffreys  0.95 0.4890147 0.5508698
4 520 1000 0.52          wald  0.95 0.4890351 0.5509649
5 520 1000 0.52        wilson  0.95 0.4890177 0.5508292
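
If installing the package is not an option, three of these rows can be approximated in base R. The sketch below assumes the textbook definitions of the exact (Clopper-Pearson), Jeffreys and Agresti-Coull intervals; I have not checked that propCI implements them identically, and the variable names are illustrative.

```r
x <- 520
n <- 1000
alpha <- 0.05

# Exact (Clopper-Pearson): beta quantiles with shifted shape parameters
exact <- c(qbeta(alpha / 2, x, n - x + 1),
           qbeta(1 - alpha / 2, x + 1, n - x))

# Jeffreys: equal-tailed quantiles of a Beta(x + 1/2, n - x + 1/2) distribution
jeffreys <- qbeta(c(alpha / 2, 1 - alpha / 2), x + 0.5, n - x + 0.5)

# Agresti-Coull: Wald formula after adding z^2 / 2 successes and z^2 / 2 failures
z <- qnorm(1 - alpha / 2)
n_adj <- n + z^2
p_adj <- (x + z^2 / 2) / n_adj
agresti_coull <- p_adj + c(-1, 1) * z * sqrt(p_adj * (1 - p_adj) / n_adj)

# One row per method, columns are the lower and upper limits
rbind(exact, jeffreys, agresti_coull)
```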

Brecht Devleesschauwer answered Oct 05 '22 22:10


Another package, tolerance, will calculate confidence and tolerance intervals for many common distributions.

Carl Witthoft answered Oct 05 '22 22:10