
How to test for Homoscedasticity (having the same population variance) in Python?

I have the following data sets that I want to run a one-way ANOVA test on:

substr = [2.011,1.865,2.002,2.202,1.896,2.209,2.222,2.087,1.905,2.052,1.828,1.968,1.907,1.898,1.849,2.172,1.883,2.14,2.074,2.05,2.159,2.323,2.201,1.971,1.855,2.088,1.943,2.081,1.981,2.038,2.064,1.84,2.091,1.993,2.059,1.986,1.957,1.956,1.847,2.033,1.907,1.88,1.92,2.035,1.852,1.949,1.892,1.888,2.1,1.975,2.038,1.849,1.9,1.891,2.0,1.875,1.95,1.959,2.087,1.863,1.749,1.91,1.979,1.87,1.984,2.029,2.077,1.952,2.003,1.858,2.098,1.895,1.962,2.19,1.989,2.055,2.145,2.033,2.154,1.944,2.114,2.242,1.929,1.931,1.938,2.038,2.093,1.966,1.952,1.978,1.967,1.86,2.129,2.176,1.914,2.163,2.161,2.109,2.077,2.105]
column = [1.853,1.916,2.007,2.157,2.034,2.291,2.326,2.011,2.075,1.888,2.017,2.105,2.168,2.046,2.04,2.146,2.02,2.213,2.188,2.261,2.333,2.509,2.264,1.857,2.088,1.843,2.211,2.256,2.045,1.947,1.944,2.063,2.203,1.999,1.901,2.213,1.939,2.089,1.964,2.01,1.988,1.903,2.092,2.145,2.097,1.933,1.858,2.075,1.869,2.013,2.183,2.035,2.221,2.024,2.106,2.045,2.036,1.981,2.072,2.019,1.863,1.911,1.937,2.385,1.878,2.056,2.01,1.984,1.983,2.178,1.909,1.886,2.126,2.166,2.296,2.125,1.998,2.313,2.207,2.095,2.331,2.177,2.095,2.078,2.02,2.147,1.99,1.938,2.028,2.081,2.168,2.178,2.054,2.123,2.1,2.37,2.057,2.336,2.024,2.061]
regex = [1.51,1.544,1.771,1.791,1.72,1.925,1.635,1.56,1.671,1.642,1.636,1.747,1.564,1.558,1.649,1.716,1.798,1.868,1.781,1.794,1.895,1.757,1.706,1.492,1.768,1.734,1.774,1.796,1.812,1.734,1.698,1.832,1.812,1.605,1.63,1.672,1.599,1.56,1.646,1.67,1.832,1.633,1.745,1.626,1.689,1.756,1.472,1.678,1.506,1.595,1.705,1.659,1.734,1.741,1.825,1.584,1.606,1.656,1.547,1.832,1.727,1.502,1.717,1.686,1.684,1.669,1.698,1.676,1.638,1.703,1.635,1.704,1.716,1.779,1.859,1.679,1.626,1.71,1.771,1.829,1.82,1.816,1.77,1.744,1.681,1.791,1.756,1.678,1.835,1.77,1.646,1.742,1.736,1.66,1.708,1.874,1.975,1.775,1.697,1.613]

Two of the main ANOVA assumptions are:

  1. The three groups are normally distributed
  2. The three groups have homogeneity of variance, meaning the population variances are equal

To test whether my groups are normally distributed, I can use scipy.stats.mstats.normaltest.

How do I test whether the three groups are homoscedastic in SciPy or another Python library?

Matthew Moisen asked Mar 12 '23 20:03



2 Answers

You can use the SciPy function scipy.stats.bartlett. According to the docs:

Bartlett’s test tests the null hypothesis that all input samples are from populations with equal variances.
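As a rough sketch of how that call could look: the lists below are only the first eight values of each group from the question (truncated for brevity; pass the full lists in practice), and the 0.05 significance level is an assumption, not something the question specifies.

```python
from scipy import stats

# First eight values of each group from the question (truncated for brevity).
substr = [2.011, 1.865, 2.002, 2.202, 1.896, 2.209, 2.222, 2.087]
column = [1.853, 1.916, 2.007, 2.157, 2.034, 2.291, 2.326, 2.011]
regex = [1.51, 1.544, 1.771, 1.791, 1.72, 1.925, 1.635, 1.56]

# bartlett() accepts two or more samples and returns (statistic, pvalue).
bartlett_stat, bartlett_p = stats.bartlett(substr, column, regex)

alpha = 0.05  # assumed significance level
print(f"Bartlett statistic = {bartlett_stat:.4f}, p-value = {bartlett_p:.4f}")

if bartlett_p < alpha:
    print("Reject H0: the variances appear to differ between groups.")
else:
    print("Fail to reject H0: no evidence against equal variances.")
```

A large p-value does not prove the variances are equal; it only means the test found no evidence against it.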

Kush Patel answered Apr 30 '23 15:04



A better approach than Bartlett's test is to use Levene's test, which is less sensitive to departures from normality.

scipy.stats.levene() returns a tuple where the first element is W, the test statistic, and the second element is the p-value for the test.
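A minimal sketch of unpacking that result, again using only the first eight values of each group from the question and an assumed 0.05 significance level. SciPy's default is center='median' (the Brown-Forsythe variant); center='mean' gives the original Levene test.

```python
from scipy import stats

# First eight values of each group from the question (truncated for brevity).
substr = [2.011, 1.865, 2.002, 2.202, 1.896, 2.209, 2.222, 2.087]
column = [1.853, 1.916, 2.007, 2.157, 2.034, 2.291, 2.326, 2.011]
regex = [1.51, 1.544, 1.771, 1.791, 1.72, 1.925, 1.635, 1.56]

# Default: deviations are taken from each group's median.
levene_stat, levene_p = stats.levene(substr, column, regex)

# Original Levene test: deviations are taken from each group's mean.
levene_mean_stat, levene_mean_p = stats.levene(
    substr, column, regex, center="mean"
)

alpha = 0.05  # assumed significance level
print(f"Levene (median): W = {levene_stat:.4f}, p-value = {levene_p:.4f}")
print(f"Levene (mean):   W = {levene_mean_stat:.4f}, p-value = {levene_mean_p:.4f}")
print("Equal variances rejected:", levene_p < alpha)
```

If the test does reject equal variances, a one-way ANOVA that assumes homoscedasticity may not be appropriate for the data.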

Marcelo Villa-Piñeros answered Apr 30 '23 13:04
