Nested ANOVA unique factor levels

Tags:

I'm running a nested ANOVA with the following setup: 2 areas, one is reference, one is exposure (column named CI = Control/Impact). Two time periods (before and after impact, column named BA), with 1 year in the before period and 3 years in the after period. The years are nested.

My question is: if I use the original years (in column Time2 in the toy dataset), I get one result. If I rename the years, so that they're just 1 for Before and 1-3 for After, I get a different result.

Questions:

Since the years were unique, should the nesting not account for the correct data structure?
My results are identical for the two models using type 1 or type 2 SS. Why do they not differ between types of SS? I would have expected BA (but not CI) to change.
Results for Type 3 SS using the unique names are the same as for types 1 and 2 SS. But using the renamed values, I get a different result. What is going on?

toy dataset:

toy <- structure(list(BA = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L
), .Label = c("A", "B"), class = "factor"), Time = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("1", "2", "3"), class = "factor"), 
Time2 = structure(c(4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L
), .Label = c("11", "12", "13", "15", "16", "17"), class = "factor"), 
Lake = c("Area 1", "Area 1", "Area 1", "Area 1", "Area 1", 
"Area 2", "Area 2", "Area 2", "Area 2", "Area 2", "Area 1", 
"Area 1", "Area 1", "Area 1", "Area 1", "Area 2", "Area 2", 
"Area 2", "Area 2", "Area 2", "Area 1", "Area 1", "Area 1", 
"Area 1", "Area 1", "Area 2", "Area 2", "Area 2", "Area 2", 
"Area 2", "Area 1", "Area 1", "Area 1", "Area 1", "Area 1", 
"Area 2", "Area 2", "Area 2", "Area 2", "Area 2"), CI = structure(c(2L, 
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L), .Label = c("C", "I"), class = "factor"), 
Response = c(78.3, 75.3, 69.4, 75.1, 71.1, 49.7, 61, 59.6, 
35.3, 26.5, 80.9, 81.4, 67.6, 73.6, 73, 46.4, 73.6, 67.1, 
34, 45.5, 86.6, 78, 68.2, 76.8, 69.6, 52.1, 61.9, 50.8, 39.2, 
49.6, 72, 74, 71, 68, 58, 40, 41, 34, 54, 61)), .Names = c("BA", 
"Time", "Time2", "Lake", "CI", "Response"), row.names = c(NA, 
40L), class = "data.frame")

analysis using type 1 SS:

mod <- lm(Response ~ BA + CI + BA*CI + BA/Time + BA/Time*CI, data = toy)
mod1 <- lm(Response ~ BA + CI + BA*CI + BA/Time2 + BA/Time2*CI, data = toy)
# results are the same
anova(mod)
anova(mod1)

now try with type 2

library(car)
options(contrasts=c("contr.sum", "contr.poly"))
mod <- lm(Response ~ BA + CI + BA*CI + BA/Time + BA/Time*CI, data = toy)
mod1 <- lm(Response ~ BA + CI + BA*CI + BA/Time2 + BA/Time2*CI, data = toy)
Anova(mod, type = "II", singular.ok = TRUE)
Anova(mod1, type = "II", singular.ok = TRUE)

and type 3

Anova(mod, type = "III", singular.ok = TRUE)
Anova(mod1, type = "III", singular.ok = TRUE)

947

asked Nov 24 '17 13:11

user2602640

1 Answers

Warning at the outset, bits of this probably aren't exactly right - I'm admittedly rusty on ANOVA at this point.

In a Type-III SS analysis we say that the main effects are qualified by the interaction. In short, that means that in the presence of higher order interactions, lower order interactions and main effects are less interpretable. Tradition pushes us towards Type-III analyses, but they really are a pain for this reason. Anyway...

Let's take a quick peek at your recode. The effect of your recode was that Time2 went from having four distinct values to having three distinct values in Time. You may hold onto some interpretability because the combination of BA at level B is unique to the time values that were previously 13, but are now 1.

Let's go back to your data. Now, BA:Time2 carries all of the same information as BA:CI. How does this look in your Time2 ANOVA results...

> Anova(mod1, type = "III", singular.ok = TRUE)
Anova Table (Type III tests)

Response: Response
            Sum Sq Df  F value    Pr(>F)    
(Intercept)  73945  1 712.5963 < 2.2e-16 ***
BA             246  1   2.3678    0.1337    
CI            2484  1  23.9401 2.713e-05 ***
BA:CI            0  1   0.0046    0.9462    
BA:Time2        95  2   0.4570    0.6372    
BA:CI:Time2     37  2   0.1797    0.8364    
Residuals     3321 32

... SS for BA:CI is, as expected (more or less) 0. Contrast this with the Time model...

> Anova(mod, type = "III", singular.ok = TRUE)
Anova Table (Type III tests)

Response: Response
            Sum Sq Df   F value    Pr(>F)    
(Intercept) 107772  1 1038.5835 < 2.2e-16 ***
BA             209  1    2.0099    0.1659    
CI            4220  1   40.6655 3.661e-07 ***
BA:CI            9  1    0.0907    0.7653    
BA:Time         95  2    0.4570    0.6372    
BA:CI:Time      37  2    0.1797    0.8364    
Residuals     3321 32

BA:CI got to hold onto a bit of variance... and the rest of the model seemed to have done better as well.

My sense is that you have a poorly specified data set for ANOVA under all coding schemes. Level B of AB is confounded with one level of your time group under both ways of coding things. Also take special note of the comments in package:car for Anova, particularly with regard to the singular.ok arguement. In short it says 'defaults to TRUE for type-II tests, and FALSE for type-III tests (where the tests for models with aliased coefficients will not be straightforwardly interpretable)' ... and here you are with seemingly aliased coefficients.

... nevermind that Fox's description of what he has done with Type II and Type III tests is... challenging to interpret. This is a reminder for me as to why I always used package:ez for ezANOVA() back in the day when I had to tolerate non-type I testing.

110

answered Sep 29 '22 14:09

russellpierce

Related questions
                            
                                Dummy variables and preProcess
                            
                                randomForestSRC predicted values
                            
                                calculate the frequency in R
                            
                                How to authenticate users of Shiny apps on shinyapps.io for Google Sheets (googlesheets)
                            
                                Why neural networks are failing in a simple classifcation case
                            
                                How to make a reproducible example of a database connection?
                            
                                Fixing pandoc "out of memory" error when running the profvis R package
                            
                                Error in setInternet2(TRUE) : use != NA is defunct
                            
                                R notebook: opts_chunk has no effect
                            
                                Inverse diagonal on matrix
                            
                                "GC overhead limit exceeded" on cache of large dataset into spark memory (via sparklyr & RStudio)
                            
                                R 3.3.2: lme4 + lmerTest problems under Mac OS Sierra
                            
                                Removing outliers from boxplot and plotly
                            
                                R - lfe (felm) fit model with fixed effects only
                            
                                Great circles Map within a country in R
                            
                                Legend colorbar is truncated and limits not respected with continuous scales
                            
                                What is causing tidyr warning message: "attributes are not identical across measure variables"
                            
                                Comment out some chunks/part of Rmd file
                            
                                Default name concatenation
                            
                                Data.table join with index delivers unexpected results if indexed column name is a prefix of the join column name

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Nested ANOVA unique factor levels

Tags:

r

anova

user2602640

People also ask

1 Answers

russellpierce

Recent Activity

Donate For Us