I'm running a nested ANOVA with the following setup: 2 areas, one is reference, one is exposure (column named CI = Control/Impact). Two time periods (before and after impact, column named BA), with 1 year in the before period and 3 years in the after period. The years are nested.
My question is: if I use the original years (in column Time2 in the toy dataset), I get one result. If I rename the years, so that they're just 1 for Before and 1-3 for After, I get a different result.
Questions:
toy dataset:
toy <- structure(list(BA = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L
), .Label = c("A", "B"), class = "factor"), Time = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("1", "2", "3"), class = "factor"),
Time2 = structure(c(4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L
), .Label = c("11", "12", "13", "15", "16", "17"), class = "factor"),
Lake = c("Area 1", "Area 1", "Area 1", "Area 1", "Area 1",
"Area 2", "Area 2", "Area 2", "Area 2", "Area 2", "Area 1",
"Area 1", "Area 1", "Area 1", "Area 1", "Area 2", "Area 2",
"Area 2", "Area 2", "Area 2", "Area 1", "Area 1", "Area 1",
"Area 1", "Area 1", "Area 2", "Area 2", "Area 2", "Area 2",
"Area 2", "Area 1", "Area 1", "Area 1", "Area 1", "Area 1",
"Area 2", "Area 2", "Area 2", "Area 2", "Area 2"), CI = structure(c(2L,
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L), .Label = c("C", "I"), class = "factor"),
Response = c(78.3, 75.3, 69.4, 75.1, 71.1, 49.7, 61, 59.6,
35.3, 26.5, 80.9, 81.4, 67.6, 73.6, 73, 46.4, 73.6, 67.1,
34, 45.5, 86.6, 78, 68.2, 76.8, 69.6, 52.1, 61.9, 50.8, 39.2,
49.6, 72, 74, 71, 68, 58, 40, 41, 34, 54, 61)), .Names = c("BA",
"Time", "Time2", "Lake", "CI", "Response"), row.names = c(NA,
40L), class = "data.frame")
analysis using type 1 SS:
mod <- lm(Response ~ BA + CI + BA*CI + BA/Time + BA/Time*CI, data = toy)
mod1 <- lm(Response ~ BA + CI + BA*CI + BA/Time2 + BA/Time2*CI, data = toy)
# results are the same
anova(mod)
anova(mod1)
now try with type 2
library(car)
options(contrasts=c("contr.sum", "contr.poly"))
mod <- lm(Response ~ BA + CI + BA*CI + BA/Time + BA/Time*CI, data = toy)
mod1 <- lm(Response ~ BA + CI + BA*CI + BA/Time2 + BA/Time2*CI, data = toy)
Anova(mod, type = "II", singular.ok = TRUE)
Anova(mod1, type = "II", singular.ok = TRUE)
and type 3
Anova(mod, type = "III", singular.ok = TRUE)
Anova(mod1, type = "III", singular.ok = TRUE)
In the two-way ANOVA design, the two factors are known as factorial (i.e., there was every combination of every level of each factor). Other experimental designs feature factors that are termed nested. This is when each level of one of the factors is unique to only one level of the other factor.
A factor B is nested in factor A if each level of B occurs in conjunction with only one level of A. (In other words, if there is a completely different set of levels of B for every level of A.)
The difference is that in a two-way anova, the values of each nominal variable are found in all combinations with the other nominal variable; in a nested anova, each value of one nominal variable (the subgroups) is found in combination with only one value of the other nominal variable (the groups).
What is a nested factor? Two factors are nested when the levels of one factor are similar but not identical, and each occurs in combination with different levels of another factor. For example, if Machine 1 is in Galveston and Machine 2 is in Baton Rouge, each machine will have different operators.
Warning at the outset, bits of this probably aren't exactly right - I'm admittedly rusty on ANOVA at this point.
In a Type-III SS analysis we say that the main effects are qualified by the interaction. In short, that means that in the presence of higher order interactions, lower order interactions and main effects are less interpretable. Tradition pushes us towards Type-III analyses, but they really are a pain for this reason. Anyway...
Let's take a quick peek at your recode. The effect of your recode was that Time2
went from having four distinct values to having three distinct values in Time
. You may hold onto some interpretability because the combination of BA
at level B is unique to the time values that were previously 13, but are now 1.
Let's go back to your data. Now, BA:Time2 carries all of the same information as BA:CI. How does this look in your Time2 ANOVA results...
> Anova(mod1, type = "III", singular.ok = TRUE)
Anova Table (Type III tests)
Response: Response
Sum Sq Df F value Pr(>F)
(Intercept) 73945 1 712.5963 < 2.2e-16 ***
BA 246 1 2.3678 0.1337
CI 2484 1 23.9401 2.713e-05 ***
BA:CI 0 1 0.0046 0.9462
BA:Time2 95 2 0.4570 0.6372
BA:CI:Time2 37 2 0.1797 0.8364
Residuals 3321 32
... SS for BA:CI is, as expected (more or less) 0. Contrast this with the Time model...
> Anova(mod, type = "III", singular.ok = TRUE)
Anova Table (Type III tests)
Response: Response
Sum Sq Df F value Pr(>F)
(Intercept) 107772 1 1038.5835 < 2.2e-16 ***
BA 209 1 2.0099 0.1659
CI 4220 1 40.6655 3.661e-07 ***
BA:CI 9 1 0.0907 0.7653
BA:Time 95 2 0.4570 0.6372
BA:CI:Time 37 2 0.1797 0.8364
Residuals 3321 32
BA:CI got to hold onto a bit of variance... and the rest of the model seemed to have done better as well.
My sense is that you have a poorly specified data set for ANOVA under all coding schemes. Level B of AB is confounded with one level of your time group under both ways of coding things. Also take special note of the comments in package:car for Anova, particularly with regard to the singular.ok arguement. In short it says 'defaults to TRUE for type-II tests, and FALSE for type-III tests (where the tests for models with aliased coefficients will not be straightforwardly interpretable)' ... and here you are with seemingly aliased coefficients.
... nevermind that Fox's description of what he has done with Type II and Type III tests is... challenging to interpret. This is a reminder for me as to why I always used package:ez for ezANOVA() back in the day when I had to tolerate non-type I testing.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With