
Fisher's exact test for bigger than 2 by 2 contingency table

Hi, scipy.stats has an implementation of Fisher's exact test, but it only works for 2 by 2 contingency tables. I want to run the test on tables bigger than 2 by 2 (5x2, 5x3). I know there is fisher.test in R which can do the job, but I want to do it in my Python code.

Does anybody know of a Python implementation of Fisher's exact test that works on bigger tables?

Also, I am not sure whether it is valid to apply Fisher's exact test to tables bigger than 2 by 2.

Thanks

svural asked Aug 18 '14 16:08



1 Answer

Yes, it is ok to do a Fisher's exact test on tables bigger than 2x2.

There currently aren't any clean, widely tested solutions for this in Python. One option is to use rpy2 and call the R function from Python:

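import numpy as np
import rpy2.robjects.numpy2ri
from rpy2.robjects.packages import importr

# Let rpy2 convert NumPy arrays to R matrices automatically
rpy2.robjects.numpy2ri.activate()

# Load R's built-in 'stats' package, which provides fisher.test
stats = importr('stats')

# 3x2 contingency table
m = np.array([[4, 4], [4, 5], [10, 6]])
res = stats.fisher_test(m)

# The first element of the result holds the p-value
print('p-value: {}'.format(res[0][0]))
# Reported output: p-value: 0.668165917041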

Another solution would be to dig into the C code that the R implementation uses and call that code directly. Here is a link to someone's GitHub project where they went back to the original Fortran implementation and call that from Python.

benbo answered Oct 10 '22 00:10