Chi squared test in Python

I'd like to run a chi-squared test in Python. I've created code to do this, but I don't know if what I'm doing is right, because the scipy docs are quite sparse.

Background first: I have two groups of users. My null hypothesis is that the two groups do not differ in how likely their users are to use desktop, mobile, or tablet.

These are the observed frequencies in the two groups:

[[u'desktop', 14452], [u'mobile', 4073], [u'tablet', 4287]]
[[u'desktop', 30864], [u'mobile', 11439], [u'tablet', 9887]]

Here is my code using scipy.stats.chi2_contingency:

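import numpy as np
from scipy import stats
# Rows are the two groups; columns are desktop, mobile, tablet.
obs = np.array([[14452, 4073, 4287], [30864, 11439, 9887]])
chi2, p, dof, expected = stats.chi2_contingency(obs)
print(p)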

This gives me a p-value of 2.02258737401e-38, which clearly is significant.

My question is: does this code look valid? In particular, I'm not sure whether I should be using scipy.stats.chi2_contingency or scipy.stats.chisquare, given the data I have.

asked Aug 05 '14 12:08

Richard

2 Answers

I can't comment too much on the use of the function. However, the issue at hand may be statistical in nature. The very small p-value you are seeing is most likely a result of your data containing large frequencies (on the order of ten thousand). With samples this large, even small differences between the groups become statistically significant, hence the small p-value. The tests you are using are very sensitive to sample size. See here for more details.
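To illustrate the sample-size point, here is a minimal sketch that runs the same test on the question's table and on a hypothetical table with identical proportions but one hundredth of the counts. The scaled table (`obs_small`) is an assumption made purely for illustration and contains non-integer counts; it is not real data.

```python
import numpy as np
from scipy import stats

# Observed counts from the question
# (rows: the two groups; columns: desktop, mobile, tablet).
obs = np.array([[14452, 4073, 4287], [30864, 11439, 9887]])

chi2_full, p_full, dof, _ = stats.chi2_contingency(obs)

# Hypothetical table: same proportions, 1/100 of the sample size.
obs_small = obs / 100.0
chi2_small, p_small, _, _ = stats.chi2_contingency(obs_small)

print("full sample:  chi2 = %.2f, p = %.3g" % (chi2_full, p_full))
print("1/100 sample: chi2 = %.2f, p = %.3g" % (chi2_small, p_small))
```

Because the proportions are unchanged, the statistic shrinks by the same factor as the sample size, and the p-value for the smaller table is no longer below the usual 0.05 threshold.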

answered Sep 30 '22 15:09

Luca Terzio Pontiggia

You are using chi2_contingency correctly. If you feel uncertain about the appropriate use of a chi-squared test or how to interpret its result (i.e. your question is about statistical testing rather than coding), consider asking it over at the "CrossValidated" site: https://stats.stackexchange.com/
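On the chi2_contingency versus chisquare part of the question, the following sketch may help show how the two relate. It assumes the same table as in the question, feeds the expected frequencies returned by chi2_contingency into scipy.stats.chisquare, and adjusts the degrees of freedom by hand with ddof. chisquare is a goodness-of-fit test against expected frequencies you supply, whereas chi2_contingency derives them from the row and column totals for you.

```python
import numpy as np
from scipy import stats

obs = np.array([[14452, 4073, 4287], [30864, 11439, 9887]])

# Test of independence: expected counts come from the table's margins.
chi2, p, dof, expected = stats.chi2_contingency(obs)

# chisquare assumes k - 1 = 5 degrees of freedom for 6 cells.
# A 2x3 contingency table has (2 - 1) * (3 - 1) = 2, so remove 3 more.
k = obs.size
stat, p_gof = stats.chisquare(obs.ravel(), f_exp=expected.ravel(),
                              ddof=k - 1 - dof)

print(chi2, stat)   # same statistic
print(p, p_gof)     # same p-value
```

For a two-way table like this one, chi2_contingency is the more direct choice because it handles the expected counts and degrees of freedom itself.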

answered Sep 30 '22 17:09

Warren Weckesser

Tags: python, numpy, scipy, chi-squared
