How do I do an F-test in Python?

Tags:

python

statistics

How do I do an F-test to check if the variance is equivalent in two vectors in Python?

For example if I have

a = [1,2,1,2,1,2,1,2,1,2]
b = [1,3,-1,2,1,5,-1,6,-1,2]

is there something similar to

scipy.stats.ttest_ind(a, b)

I found

sp.stats.f(a, b)

but it appears to be something different from an F-test.

415

asked Feb 01 '14 04:02

DrewH

2 Answers

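The test statistic for the F-test of equal variances is simply the ratio of the two sample variances:

F = Var(X) / Var(Y)

Under the null hypothesis of equal variances (and assuming X and Y are normally distributed), F follows an F distribution with df1 = len(X) - 1 and df2 = len(Y) - 1 degrees of freedom.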

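scipy.stats.f, which you mentioned in your question, is the F distribution itself rather than a test. It has cdf and sf (1 - cdf) methods, so you can compute a p-value for the given statistic and reject the null hypothesis when that p-value is less than your chosen alpha level. For a two-sided test, the p-value is twice the smaller of the two tail probabilities.

Thus:

import numpy as np
import scipy.stats
F = np.var(X, ddof=1) / np.var(Y, ddof=1)  # ratio of sample variances
df1, df2 = len(X) - 1, len(Y) - 1
alpha = 0.05  # Or whatever you want your alpha to be.
p_value = 2 * min(scipy.stats.f.cdf(F, df1, df2), scipy.stats.f.sf(F, df1, df2))  # two-sided
if p_value < alpha:
    print("Reject the null hypothesis that Var(X) == Var(Y)")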
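As a fuller sketch, here is the same idea wrapped in a function and applied to the two vectors from the question. It assumes unbiased sample variances (ddof=1) and a two-sided alternative; f_test is a hypothetical helper name, not a SciPy function.

```python
import numpy as np
from scipy import stats


def f_test(x, y):
    """Two-sided F-test for equal variances (hypothetical helper, not part of SciPy)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Ratio of unbiased sample variances; note that np.var defaults to ddof=0.
    f_stat = np.var(x, ddof=1) / np.var(y, ddof=1)
    df1, df2 = x.size - 1, y.size - 1
    # Two-sided p-value: twice the smaller of the two tail probabilities.
    lower = stats.f.cdf(f_stat, df1, df2)
    upper = stats.f.sf(f_stat, df1, df2)
    p_value = min(1.0, 2 * min(lower, upper))
    return f_stat, p_value


a = [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
b = [1, 3, -1, 2, 1, 5, -1, 6, -1, 2]

f_stat, p_value = f_test(a, b)
alpha = 0.05  # chosen significance level
print(f"F = {f_stat:.4f}, p = {p_value:.4g}")
if p_value < alpha:
    print("Reject the null hypothesis of equal variances")
else:
    print("Fail to reject the null hypothesis of equal variances")
```

Swapping the arguments inverts F and swaps the degrees of freedom, so the two-sided p-value stays the same whichever sample goes in the numerator.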

Note that the F-test is extremely sensitive to non-normality of X and Y, so unless you're reasonably sure that X and Y are distributed normally, you're probably better off with a more robust test such as Levene's test. Bartlett's test is another common choice for comparing variances, but it is also sensitive to departures from normality. Both can be found in the SciPy API:

Bartlett's test (scipy.stats.bartlett)
Levene's test (scipy.stats.levene)
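A minimal sketch of calling both tests on the vectors from the question. Both functions take the samples as separate positional arguments and return a statistic and a p-value; center="median" is SciPy's default for levene and is written out here only for clarity.

```python
from scipy import stats

a = [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
b = [1, 3, -1, 2, 1, 5, -1, 6, -1, 2]

# Levene's test centered on the median (often called the Brown-Forsythe variant).
lev_stat, lev_p = stats.levene(a, b, center="median")

# Bartlett's test, which assumes the samples come from normal populations.
bart_stat, bart_p = stats.bartlett(a, b)

print(f"Levene:   W = {lev_stat:.3f}, p = {lev_p:.4g}")
print(f"Bartlett: T = {bart_stat:.3f}, p = {bart_p:.4g}")
```

In both cases a p-value below your chosen alpha is evidence against equal variances.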

153

answered Oct 14 '22 10:10

Joel Cornett

For anyone who came here searching for an ANOVA F-test or to compare between models for feature selection:

sklearn.feature_selection.f_classif does ANOVA tests, and
sklearn.feature_selection.f_regression does sequential testing of regressions
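A small sketch of how these two functions might be called, assuming scikit-learn is installed. The toy data below is made up purely for illustration: feature 0 tracks the target closely and feature 1 does not. Each function scores every column of X separately and returns one F statistic and one p-value per feature.

```python
import numpy as np
from sklearn.feature_selection import f_classif, f_regression

# Hypothetical toy data: 6 samples, 2 features.
X = np.array([
    [1.0, 5.0],
    [1.2, 3.0],
    [0.9, 4.0],
    [3.1, 4.5],
    [2.9, 3.5],
    [3.0, 4.2],
])
y_class = np.array([0, 0, 0, 1, 1, 1])            # class labels for f_classif
y_cont = np.array([1.1, 1.0, 0.8, 3.2, 3.0, 2.9])  # continuous target for f_regression

# One-way ANOVA F-test of each feature against the class labels.
F_cls, p_cls = f_classif(X, y_class)

# Univariate linear-regression F-test of each feature against the target.
F_reg, p_reg = f_regression(X, y_cont)

print("f_classif:   ", F_cls, p_cls)
print("f_regression:", F_reg, p_reg)
```

Features with small p-values are the ones whose values differ most across classes (f_classif) or vary most linearly with the target (f_regression).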

answered Oct 14 '22 10:10

slushy
