I have a question about the hypergeometric test.
My data look like this:
Population size: 5260
Sample size: 131
Number of items in the population classified as successes: 1998
Number of items in the sample classified as successes: 62
Is this the correct way to run a hypergeometric test?
phyper(62, 1998, 5260, 131)
When do we use the hypergeometric distribution? The hypergeometric distribution is a discrete probability distribution. It gives the probability of obtaining a given number of successes when a sample of fixed size is drawn without replacement from a finite population that contains a known number of successes.
In a test for over-representation of successes in the sample, the hypergeometric p-value is the probability of randomly drawing the observed number of successes or more from the population in the given number of draws. In a test for under-representation, the p-value is the probability of randomly drawing the observed number of successes or fewer.
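As a rough sketch of those two definitions, both tail probabilities can be obtained by summing the hypergeometric probability mass function directly. The numbers are the ones from the question; the variable names are only illustrative. Note that dhyper expects the number of failures in the population (population size minus successes) as its third argument.

```r
# Numbers from the question; variable names are illustrative only.
pop_size         <- 5260
pop_successes    <- 1998
sample_size      <- 131
sample_successes <- 62

# Probability of each possible number of successes in the sample.
# dhyper(x, m, n, k): m = successes in population, n = failures in population.
counts <- 0:sample_size
pmf <- dhyper(counts, pop_successes, pop_size - pop_successes, sample_size)

# Over-representation: P(X >= observed)
p_over <- sum(pmf[counts >= sample_successes])

# Under-representation: P(X <= observed)
p_under <- sum(pmf[counts <= sample_successes])

print(c(over = p_over, under = p_under))
```

Both tails include the observed count itself, so the two p-values add up to slightly more than 1.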
Almost correct. If you look at ?phyper:
phyper(q, m, n, k, lower.tail = TRUE, log.p = FALSE)

x, q: vector of quantiles representing the number of white balls drawn without replacement from an urn which contains both black and white balls.
m: the number of white balls in the urn.
n: the number of black balls in the urn.
k: the number of balls drawn from the urn.
So using your data:
phyper(62, 1998, 5260 - 1998, 131)
[1] 0.989247
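One thing to keep in mind: with the default lower.tail = TRUE, phyper returns P(X <= q), which is the under-representation tail. If you are actually testing for over-representation, you probably want P(X >= 62). With lower.tail = FALSE, phyper returns P(X > q), so the usual trick is to pass q - 1. A minimal sketch, assuming the same numbers as above:

```r
# Lower tail, P(X <= 62): same call as above
p_under <- phyper(62, 1998, 5260 - 1998, 131)

# Upper tail, P(X >= 62): lower.tail = FALSE gives P(X > q),
# so use q = 62 - 1 to include the observed count
p_over <- phyper(62 - 1, 1998, 5260 - 1998, 131, lower.tail = FALSE)

print(c(under = p_under, over = p_over))
```

Which tail is appropriate depends on the hypothesis you want to test, so treat this as a starting point rather than the definitive answer for your data.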