Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Nested ANOVA unique factor levels

Tags:

r

anova

I'm running a nested ANOVA with the following setup: 2 areas, one is reference, one is exposure (column named CI = Control/Impact). Two time periods (before and after impact, column named BA), with 1 year in the before period and 3 years in the after period. The years are nested.

My question is: if I use the original years (in column Time2 in the toy dataset), I get one result. If I rename the years, so that they're just 1 for Before and 1-3 for After, I get a different result.

Questions:

  1. Since the years were unique, should the nesting not account for the correct data structure?
  2. My results are identical for the two models using type 1 or type 2 SS. Why do they not differ between types of SS? I would have expected BA (but not CI) to change.
  3. Results for Type 3 SS using the unique names are the same as for types 1 and 2 SS. But using the renamed values, I get a different result. What is going on?

toy dataset:

toy <- structure(list(BA = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L
), .Label = c("A", "B"), class = "factor"), Time = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("1", "2", "3"), class = "factor"), 
Time2 = structure(c(4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L
), .Label = c("11", "12", "13", "15", "16", "17"), class = "factor"), 
Lake = c("Area 1", "Area 1", "Area 1", "Area 1", "Area 1", 
"Area 2", "Area 2", "Area 2", "Area 2", "Area 2", "Area 1", 
"Area 1", "Area 1", "Area 1", "Area 1", "Area 2", "Area 2", 
"Area 2", "Area 2", "Area 2", "Area 1", "Area 1", "Area 1", 
"Area 1", "Area 1", "Area 2", "Area 2", "Area 2", "Area 2", 
"Area 2", "Area 1", "Area 1", "Area 1", "Area 1", "Area 1", 
"Area 2", "Area 2", "Area 2", "Area 2", "Area 2"), CI = structure(c(2L, 
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L), .Label = c("C", "I"), class = "factor"), 
Response = c(78.3, 75.3, 69.4, 75.1, 71.1, 49.7, 61, 59.6, 
35.3, 26.5, 80.9, 81.4, 67.6, 73.6, 73, 46.4, 73.6, 67.1, 
34, 45.5, 86.6, 78, 68.2, 76.8, 69.6, 52.1, 61.9, 50.8, 39.2, 
49.6, 72, 74, 71, 68, 58, 40, 41, 34, 54, 61)), .Names = c("BA", 
"Time", "Time2", "Lake", "CI", "Response"), row.names = c(NA, 
40L), class = "data.frame")

analysis using type 1 SS:

mod <- lm(Response ~ BA + CI + BA*CI + BA/Time + BA/Time*CI, data = toy)
mod1 <- lm(Response ~ BA + CI + BA*CI + BA/Time2 + BA/Time2*CI, data = toy)
# results are the same
anova(mod)
anova(mod1)

now try with type 2

library(car)
options(contrasts=c("contr.sum", "contr.poly"))
mod <- lm(Response ~ BA + CI + BA*CI + BA/Time + BA/Time*CI, data = toy)
mod1 <- lm(Response ~ BA + CI + BA*CI + BA/Time2 + BA/Time2*CI, data = toy)
Anova(mod, type = "II", singular.ok = TRUE)
Anova(mod1, type = "II", singular.ok = TRUE)

and type 3

Anova(mod, type = "III", singular.ok = TRUE)
Anova(mod1, type = "III", singular.ok = TRUE)
like image 947
user2602640 Avatar asked Nov 24 '17 13:11

user2602640


People also ask

What is a nested factor in ANOVA?

In the two-way ANOVA design, the two factors are known as factorial (i.e., there was every combination of every level of each factor). Other experimental designs feature factors that are termed nested. This is when each level of one of the factors is unique to only one level of the other factor.

How do you determine if a factor is nested?

A factor B is nested in factor A if each level of B occurs in conjunction with only one level of A. (In other words, if there is a completely different set of levels of B for every level of A.)

What is the difference between two-way Anova and nested ANOVA?

The difference is that in a two-way anova, the values of each nominal variable are found in all combinations with the other nominal variable; in a nested anova, each value of one nominal variable (the subgroups) is found in combination with only one value of the other nominal variable (the groups).

What is a nested factor?

What is a nested factor? Two factors are nested when the levels of one factor are similar but not identical, and each occurs in combination with different levels of another factor. For example, if Machine 1 is in Galveston and Machine 2 is in Baton Rouge, each machine will have different operators.


1 Answers

Warning at the outset, bits of this probably aren't exactly right - I'm admittedly rusty on ANOVA at this point.

In a Type-III SS analysis we say that the main effects are qualified by the interaction. In short, that means that in the presence of higher order interactions, lower order interactions and main effects are less interpretable. Tradition pushes us towards Type-III analyses, but they really are a pain for this reason. Anyway...

Let's take a quick peek at your recode. The effect of your recode was that Time2 went from having four distinct values to having three distinct values in Time. You may hold onto some interpretability because the combination of BA at level B is unique to the time values that were previously 13, but are now 1.

Let's go back to your data. Now, BA:Time2 carries all of the same information as BA:CI. How does this look in your Time2 ANOVA results...

> Anova(mod1, type = "III", singular.ok = TRUE)
Anova Table (Type III tests)

Response: Response
            Sum Sq Df  F value    Pr(>F)    
(Intercept)  73945  1 712.5963 < 2.2e-16 ***
BA             246  1   2.3678    0.1337    
CI            2484  1  23.9401 2.713e-05 ***
BA:CI            0  1   0.0046    0.9462    
BA:Time2        95  2   0.4570    0.6372    
BA:CI:Time2     37  2   0.1797    0.8364    
Residuals     3321 32                 

... SS for BA:CI is, as expected (more or less) 0. Contrast this with the Time model...

> Anova(mod, type = "III", singular.ok = TRUE)
Anova Table (Type III tests)

Response: Response
            Sum Sq Df   F value    Pr(>F)    
(Intercept) 107772  1 1038.5835 < 2.2e-16 ***
BA             209  1    2.0099    0.1659    
CI            4220  1   40.6655 3.661e-07 ***
BA:CI            9  1    0.0907    0.7653    
BA:Time         95  2    0.4570    0.6372    
BA:CI:Time      37  2    0.1797    0.8364    
Residuals     3321 32                        

BA:CI got to hold onto a bit of variance... and the rest of the model seemed to have done better as well.

My sense is that you have a poorly specified data set for ANOVA under all coding schemes. Level B of AB is confounded with one level of your time group under both ways of coding things. Also take special note of the comments in package:car for Anova, particularly with regard to the singular.ok arguement. In short it says 'defaults to TRUE for type-II tests, and FALSE for type-III tests (where the tests for models with aliased coefficients will not be straightforwardly interpretable)' ... and here you are with seemingly aliased coefficients.

... nevermind that Fox's description of what he has done with Type II and Type III tests is... challenging to interpret. This is a reminder for me as to why I always used package:ez for ezANOVA() back in the day when I had to tolerate non-type I testing.

like image 110
russellpierce Avatar answered Sep 29 '22 14:09

russellpierce