
Python scipy chisquare returns different values than R chisquare

I am trying to use scipy.stats.chisquare. I have built a toy example:

In [1]: import scipy.stats as sps

In [2]: import numpy as np

In [3]: sps.chisquare(np.array([38,27,23,17,11,4]), np.array([98, 100, 80, 85,60,23]))
Out[3]: (240.74951271813072, 5.302429887719704e-50)

The same example in R returns:

> chisq.test(matrix(c(38,27,23,17,11,4,98,100,80,85,60,23), ncol=2))

Pearson's Chi-squared test

data:  matrix(c(38, 27, 23, 17, 11, 4, 98, 100, 80, 85, 60, 23), ncol = 2)
X-squared = 7.0762, df = 5, p-value = 0.215

What am I doing wrong?

Thanks

asked Dec 10 '13 10:12 by gc5


1 Answer

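The two calls run different tests. scipy.stats.chisquare treats its second argument as the expected frequencies (a one-way goodness-of-fit test), whereas R's chisq.test on a two-column matrix tests independence in a contingency table. For this chisq.test call, the Python equivalent is chi2_contingency. From its documentation: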

This function computes the chi-square statistic and p-value for the hypothesis test of independence of the observed frequencies in the contingency table observed.

>>> import numpy as np
>>> import scipy.stats
>>> arr = np.array([38,27,23,17,11,4,98,100,80,85,60,23]).reshape(2,-1)
>>> arr
array([[ 38,  27,  23,  17,  11,   4],
       [ 98, 100,  80,  85,  60,  23]])
>>> chi2, p, dof, expected = scipy.stats.chi2_contingency(arr)
>>> chi2, p, dof
(7.0762165124844367, 0.21503342516989818, 5)
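To see where the two numbers come from, here is a minimal sketch that recomputes both statistics by hand. The variable names (group_a, group_b, table and so on) are made up for illustration and are not part of the original question or answer. It assumes the goodness-of-fit statistic is the plain Pearson sum over (observed - expected)^2 / expected, and that no continuity correction applies, which should hold here because the table has more than one degree of freedom.

```python
import numpy as np
from scipy import stats

# The two columns from the question (names are illustrative only).
group_a = np.array([38, 27, 23, 17, 11, 4])
group_b = np.array([98, 100, 80, 85, 60, 23])

# What the question's chisquare call effectively computed:
# group_b is used as the *expected* counts for group_a.
gof_stat = ((group_a - group_b) ** 2 / group_b).sum()

# Test of independence: expected counts come from the table margins,
# expected[i, j] = row_total[i] * col_total[j] / grand_total.
table = np.vstack([group_a, group_b])
row_totals = table.sum(axis=1, keepdims=True)
col_totals = table.sum(axis=0, keepdims=True)
expected = row_totals * col_totals / table.sum()

chi2_manual = ((table - expected) ** 2 / expected).sum()
dof = (table.shape[0] - 1) * (table.shape[1] - 1)
p_manual = stats.chi2.sf(chi2_manual, dof)  # upper-tail probability

# Cross-check against SciPy's own contingency-table routine.
chi2, p, dof_scipy, _ = stats.chi2_contingency(table)

print(gof_stat)                    # the large value from the question
print(chi2_manual, p_manual, dof)  # should agree with R and chi2_contingency
```

The first statistic is large because the two columns have very different totals (120 versus 446), so treating one as the expected counts of the other cannot fit. The test of independence only compares the proportions across the six categories, which is what R's chisq.test does for a matrix.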
answered Nov 04 '22 07:11 by alko