Calculating adjusted p-values in Python

So, I've been spending some time looking for a way to get adjusted p-values (aka corrected p-values, q-values, FDR) in Python, but I haven't really found anything. There's the R function p.adjust, but I would like to stick to Python coding, if possible. Is there anything similar for Python?

If this is somehow a bad question, sorry in advance! I did search for answers first, but found none (except a Matlab version)... Any help is appreciated!

erikfas asked Aug 07 '14 14:08



2 Answers

It is available in statsmodels.

http://statsmodels.sourceforge.net/devel/stats.html#multiple-tests-and-multiple-comparison-procedures

http://statsmodels.sourceforge.net/devel/generated/statsmodels.sandbox.stats.multicomp.multipletests.html

Some explanations, examples, and Monte Carlo simulations are here: http://jpktd.blogspot.com/2013/04/multiple-testing-p-value-corrections-in.html
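For reference, here is a minimal sketch of what a call might look like. It assumes a reasonably recent statsmodels release, where the function can be imported from statsmodels.stats.multitest (the sandbox path in the link above is the older location). The p-values are made up for illustration.

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

# Hypothetical raw p-values, for illustration only.
p_vals = np.array([0.01, 0.035, 0.02, 0.005, 0.20])

# method="fdr_bh" selects Benjamini-Hochberg; other options include
# "bonferroni" and "holm".
reject, p_adjusted, _, _ = multipletests(p_vals, alpha=0.05, method="fdr_bh")

print(p_adjusted)  # adjusted p-values, in the same order as the input
print(reject)      # boolean array: which hypotheses are rejected at alpha
```

The adjusted values come back in input order, so they can be attached directly to whatever table the raw p-values came from.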

Josef answered Sep 19 '22 12:09


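According to the biostathandbook, the Benjamini-Hochberg (BH) adjustment is easy to compute:

    import numpy as np
    from scipy.stats import rankdata

    def fdr(p_vals):
        # Convert to a float array so plain lists work too.
        p_vals = np.asarray(p_vals, dtype=float)
        # Rank 1 = smallest p-value (ties get the average rank).
        ranked_p_values = rankdata(p_vals)
        # BH formula: p * (number of tests) / rank, capped at 1.
        # Note: this does not force the result to be monotone in rank.
        fdr = p_vals * len(p_vals) / ranked_p_values
        fdr[fdr > 1] = 1
        return fdr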
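
The formula above gives p * m / rank for each value. R's p.adjust with method="BH" additionally takes a running minimum from the largest p-value downward, so the adjusted values never decrease as the raw p-values increase. Below is a NumPy-only sketch of that variant, assuming that behavior is what is wanted. The function name bh_adjust is made up here.

```python
import numpy as np

def bh_adjust(p_vals):
    """Benjamini-Hochberg adjusted p-values (hypothetical helper name)."""
    p = np.asarray(p_vals, dtype=float)
    m = p.size
    order = np.argsort(p)                        # indices that sort p ascending
    scaled = p[order] * m / np.arange(1, m + 1)  # p * m / rank
    # Running minimum from the largest p-value down keeps values monotone.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(scaled, 1.0)    # cap at 1, restore input order
    return adjusted

print(bh_adjust([0.01, 0.035, 0.02, 0.005, 0.20]))
```

The two versions differ only when a larger p-value ends up with a smaller p * m / rank than a smaller one. For example, with [0.01, 0.011, 0.5] the first value is pulled down to match the second.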

The Unfun Cat answered Sep 22 '22 12:09