
Calculating cumulative hypergeometric distribution

Tags: r, probability

Suppose I have 100 marbles, 8 of which are red. I draw 30 marbles and want to know the probability that at least five of them are red. I am currently using http://stattrek.com/online-calculator/hypergeometric.aspx, where I entered 100, 8, 30, and 5 for population size, number of successes, sample size, and number of successes in sample, respectively. The probability I'm interested in is the cumulative probability $P(X \geq 5)$, which is 0.050 in this case. My question is: how do I calculate this in R?

I tried

> 1-phyper(5, 8, 92, 30, lower.tail = TRUE)
[1] 0.008503108

But this is very different from the calculator's result of 0.050.

Adrian asked Mar 18 '26 20:03



1 Answer

phyper(5, 8, 92, 30) gives the probability of drawing five or fewer red marbles, $P(X \leq 5)$.

1 - phyper(5, 8, 92, 30) thus returns the probability of getting six or more red marbles, $P(X \geq 6)$, which is why your result is too small.

Since you want the probability of getting five or more (i.e. more than 4) red marbles, you should use one of the following:

1 - phyper(4, 8, 92, 30)
[1] 0.05042297

phyper(4, 8, 92, 30, lower.tail=FALSE)
[1] 0.05042297
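
One way to sanity-check the off-by-one reasoning is to add up the individual point probabilities with dhyper and compare the total with the phyper tail. The following is a minimal sketch using the numbers from the question; the variable names (m, n, k, p_sum, p_tail, p_off_by_one) are chosen here for illustration only.

```r
# Parameters from the question: 100 marbles in total, 8 of them red, 30 drawn.
m <- 8    # red marbles in the urn
n <- 92   # non-red marbles in the urn
k <- 30   # marbles drawn

# P(X >= 5) as an explicit sum of P(X = 5), P(X = 6), P(X = 7), P(X = 8)
p_sum <- sum(dhyper(5:8, m, n, k))

# The same upper tail via the cumulative distribution function: P(X > 4)
p_tail <- phyper(4, m, n, k, lower.tail = FALSE)

# Dropping the x = 5 term gives P(X >= 6), the value obtained in the question
p_off_by_one <- sum(dhyper(6:8, m, n, k))

print(c(p_sum, p_tail, p_off_by_one))
```

If the reasoning holds, the first two values should agree with each other (about 0.0504), and the third should match the 0.0085 from the original attempt.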
Josh O'Brien answered Mar 21 '26 10:03
