Saving statsmodels Tukey HSD results into a Python pandas DataFrame

I am looking for a way to save the results of the Tukey HSD test into a pandas DataFrame. See below:

import matplotlib.pyplot as plt
import statsmodels.formula.api as smf
import statsmodels.stats.multicomp as multi 

mcDate = multi.MultiComparison(df['Glucose'], df['Date'])
Results = mcDate.tukeyhsd()
print(Results)

    Multiple Comparison of Means - Tukey HSD,FWER=0.05
=============================================
group1 group2 meandiff  lower   upper  reject
---------------------------------------------
  A      B     20.35    7.388   33.312  True 
  A      C     -3.85   -16.812  9.112  False 
  B      C     -24.2   -37.162 -11.238  True 
---------------------------------------------
Pancani asked Dec 10 '22 14:12


1 Answer

I do not have access to your data, so I can't replicate the result. I used randomised data instead, just to show that this works. All you need to add to your code is the pandas import, and the last line creating the data frame.

import matplotlib.pyplot as plt
import statsmodels.formula.api as smf
import statsmodels.stats.multicomp as multi
import pandas as pd
import numpy as np

# Random Data.
np.random.seed(0)
x = np.random.choice(['A','B','C'], 50)
y = np.random.rand(50)

# Tukey HSD multiple comparison.
mcDate = multi.MultiComparison(y,x)
Results = mcDate.tukeyhsd()
print(Results)

Produces the following table:

============================================
group1 group2 meandiff  lower  upper  reject
--------------------------------------------
  A      B     0.1506   -0.07  0.3712 False 
  A      C     0.1105  -0.1278 0.3487 False 
  B      C    -0.0401  -0.2865 0.2063 False 
--------------------------------------------

And, this is how you get the data frame:

df = pd.DataFrame(data=Results._results_table.data[1:], columns=Results._results_table.data[0])

print(df)

  group1 group2  meandiff   lower   upper  reject
0      A      B    0.1506 -0.0700  0.3712   False
1      A      C    0.1105 -0.1278  0.3487   False
2      B      C   -0.0401 -0.2865  0.2063   False
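
As a possible alternative that avoids the private `_results_table` attribute, the same table should also be reachable through the result object's `summary()` method. This is a minimal sketch, assuming your statsmodels version returns a SimpleTable from `summary()` with the header in the first row; the names `results` and `table` are mine, and newer statsmodels versions may include an extra `p-adj` column:

```python
import numpy as np
import pandas as pd
import statsmodels.stats.multicomp as multi

# Same randomised data as in the example above.
np.random.seed(0)
x = np.random.choice(['A', 'B', 'C'], 50)
y = np.random.rand(50)

results = multi.MultiComparison(y, x).tukeyhsd()

# summary() returns the table that print(results) displays.
# Its .data attribute is a list of rows, with the header in row 0.
table = results.summary().data
df = pd.DataFrame(table[1:], columns=table[0])
print(df)
```

Taking the column names from the header row, rather than hard-coding them, keeps this working if the set of columns differs between versions.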

I struggled with this for a while myself, and eventually found the solution by reviewing methods for the object, like this:

dir(Results)
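
The raw `dir()` output is long, so filtering it can make the search quicker. A small sketch of that idea, using the same randomised data; the variable names `results` and `public_names` are only for illustration:

```python
import numpy as np
import statsmodels.stats.multicomp as multi

np.random.seed(0)
x = np.random.choice(['A', 'B', 'C'], 50)
y = np.random.rand(50)
results = multi.MultiComparison(y, x).tukeyhsd()

# Public names only: anything starting with an underscore is internal.
public_names = [name for name in dir(results) if not name.startswith('_')]
print(public_names)

# Searching all names (private ones included) for "table" points to
# where the printed summary is stored.
print([name for name in dir(results) if 'table' in name])
```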
vander answered Jan 11 '23 23:01