Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to do a Tukey HSD test with the Anova command (car package)

Tags:

r

lm

anova

tukey

r-car

I'm dealing with an unbalanced design/sample and originally learned aov(). I know now that for my ANOVA tests I need to use the Type III Sum of Squares which involves using fitting using lm() rather than using aov().

The problem is getting post-hoc tests (specifically Tukey's HSD) using lm(). All the research I've done has said that using simint in the multcomp package would work, but now that it's updated that command seems to not be available. It also seems to rely upon going through aov() to calculate.

Essentially all of the Tukey HSD tests I've found for R assume that you use aov() for the comparison rather than lm(). To get the Type III Sum of Squares I need for the unbalanced design I have to use:

mod<-lm(Snavg~StudentEthnicity*StudentGender)

Anova(mod, type="III")

How do I use a Tukey HSD test with my mod using lm()? Or conversely, calculate my ANOVA using Type III and still be able to run a Tukey HSD test?

Thanks!

like image 800
leighadlr Avatar asked Oct 11 '11 21:10

leighadlr


People also ask

Is a Tukey test an ANOVA?

Tukey's Honest Significant Difference (HSD) test is a post hoc test commonly used to assess the significance of differences between pairs of group means. Tukey HSD is often a follow up to one-way ANOVA, when the F-test has revealed the existence of a significant difference between some of the tested groups.

What is ANOVA and Tukey post hoc test?

The Tukey Test (or Tukey procedure), also called Tukey's Honest Significant Difference test, is a post-hoc test based on the studentized range distribution. An ANOVA test can tell you if your results are significant overall, but it won't tell you exactly where those differences lie.


1 Answers

As an initial note, unless it's been changed, to get the correct results for type iii sum of squares, you need to set the contrast coding for the factor variables. This can be done inside the lm call or with options. The example below uses options.

I would be cautious about using HSD.test and similar functions with unbalanced designs unless the documentation addresses their use in these situations. The documentation for TukeyHSD mentions that it adjusts for "mildly unbalanced" designs. I don't know if HSD.test handles things differently. You'd have to check additional documentation for the package or the original reference cited for the function.

As a side note, enclosing the whole HSD.test function in parentheses will cause it to print the results. See example below.

In general, I would recommend using the flexible emmeans (née lsmeans) or multcomp packages for all your post-hoc comparison needs. emmeans is particularly useful for doing mean separations on interactions or for examining contrasts among treatments. [EDIT: Caveat that I am the author of these pages.]

With an unbalanced design, you may want to report the E.M. (or L.S.) means instead of the arithmetic means. See SAEPER: What are least square means?. [EDIT: Caveat that I am the author of this page.] Note in the example below that the marginal means reported by emmeans are different than those reported by HSD.test.

Also note that the "Tukey" in glht has nothing to do with Tukey HSD or Tukey-adjusted comparisons; it just sets up the contrasts for all pairwise tests, as the output says.

However, the adjust="tukey" in emmeans functions does mean to use Tukey-adjusted comparisons, as the output says.

The following example is partially adapted from ARCHBS: One-way Anova.

### EDIT: Some code changed to reflect changes to some functions
###  in the emmeans package

if(!require(car)){install.packages("car")}
library(car)
data(mtcars)
mtcars$cyl.f = factor(mtcars$cyl)
mtcars$carb.f = factor(mtcars$carb)

options(contrasts = c("contr.sum", "contr.poly"))

model = lm(mpg ~ cyl.f + carb.f, data=mtcars)

library(car)
Anova(model, type="III")

if(!require(agricolae)){install.packages("agricolae")}
library(agricolae)
(HSD.test(model, "cyl")$groups)

if(!require(emmeans)){install.packages("emmeans")}
library(emmeans)

marginal = emmeans(model,
                   ~ cyl.f)

pairs(marginal, adjust="tukey")

if(!require(multcomp)){install.packages("multcomp")}
library(multcomp)

cld(marginal, adjust="tukey", Letters=letters)


if(!require(multcomp)){install.packages("multcomp")}
library(multcomp)

mc = glht(model,
          mcp(cyl.f = "Tukey"))

summary(mc, test=adjusted("single-step"))

cld(mc)
like image 94
Sal Mangiafico Avatar answered Oct 04 '22 04:10

Sal Mangiafico