I am working with a hierarchical model (lmerMod) that has an outcome and two categorical predictors, cat1 and cat2. I want to make an interaction plot with ggplot2 that visualizes the model coefficients and takes two specific complexities into account:
Pairwise comparisons: I need to compare the coefficients for each level of cat2 (Control, T1, T2, T3) against each other within each level of cat1 (1 and 2).
Statistical significance: I want to highlight the statistical significance of these comparisons with asterisks or some similar notation, e.g. via the ggsignif or ggpval packages, or the ggpubr::stat_compare_means() function.
Some additional constraints:
- The estimates should be drawn as a pointrange.
- cat1 should be on the x-axis.
- cat2 should define the color aesthetic.
- I use the lmerTest package for model fitting (to get correct p-values).
- ggeffects::hypothesis_test() takes a very long time for my real data (N = 46,000, G1 = 4,000, G2 = 40).
- I prefer plain ggplot solutions for the brackets (significance stars can be placed with geom_text() and ifelse(), I guess).

Here's a simple R example using a toy dataset:
# Load required packages
library(lmerTest)
library(ggplot2)
# Create toy data
set.seed(123)
N <- 1000 # Increased sample size
G1 <- 20 # Increased number of levels for G1
G2 <- 10 # Increased number of levels for G2
df <- data.frame(
  outcome = rnorm(N),
  cat1 = factor(rep(1:2, each = N / 2)),
  cat2 = factor(rep(c("Control", "T1", "T2", "T3"), each = N / 4)),
  G1 = factor(rep(1:G1, each = N / G1)),
  G2 = factor(rep(1:G2, each = N / G2))
)
# Fit the hierarchical model
fit <- lmerTest::lmer(outcome ~ cat1 * cat2 + (1|G1) + (1|G2), data = df)
summary(fit)
# Attempt to make the plot
interactions::cat_plot(model = fit, pred = cat1, modx = cat2,
                       interval.geom = "linerange", colors = "Set1")
# ... (I want significance stars and brackets connecting the coefficients: this is where I am stuck)

First, note that there are two problems with your data generation; when fitting the model you get these warnings:
fixed-effect model matrix is rank deficient so dropping 4 columns / coefficients
boundary (singular) fit: see help('isSingular')
The first issue arises because there is perfect multicollinearity in the predictors. This means one or more predictors can be perfectly predicted using other predictors in the model. When this happens, the matrix of predictors (often called the design matrix) cannot be inverted, and the model coefficients cannot be uniquely estimated.
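In your toy data the confounding comes from generating both factors with rep(..., each = ...): cat1 changes halfway through the data and cat2 changes every quarter, so each cat1 level only ever co-occurs with two of the four cat2 levels. A cross-tabulation makes this visible:

```r
# Reproduce the two factors exactly as in the question's toy data
cat1 <- factor(rep(1:2, each = 500))
cat2 <- factor(rep(c("Control", "T1", "T2", "T3"), each = 250))
tab <- table(cat1, cat2)
tab
# cat1 = 1 co-occurs only with Control/T1, and cat1 = 2 only with T2/T3,
# so half of the interaction cells are empty and the design matrix is
# rank deficient
```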
The second warning is likely due to too little variation in one of the random effects: the variance for one of the grouping factors is estimated at (or very near) zero. For a toy dataset this is harmless; if it persists with your real data, consider simplifying the model by dropping the offending random effect. With the collinearity fixed by randomly shuffling cat2, I think we can achieve what you want with the following code:
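As an aside, you can check for a singular fit directly with lme4's isSingular() and inspect the variance components with VarCorr(). A minimal standalone sketch with pure-noise data (where the group variance is typically estimated at or near zero):

```r
library(lme4)

set.seed(1)
# Pure noise: there is no real between-group variation, so the
# random-intercept variance tends to be estimated at the zero boundary
d <- data.frame(y = rnorm(200), g = factor(rep(1:10, each = 20)))
m <- lmer(y ~ 1 + (1 | g), data = d)

isSingular(m)  # TRUE when a variance component sits on the boundary
VarCorr(m)     # inspect the estimated random-effect SDs
```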
library(lmerTest)
library(ggplot2)
library(effects)
# Create toy data
set.seed(123)
N <- 1000
G1 <- 20
G2 <- 10
df <- data.frame(
  outcome = rnorm(N),
  cat1 = factor(rep(1:2, each = N / 2)),
  cat2 = factor(sample(c("Control", "T1", "T2", "T3"), N, replace = TRUE)), # Shuffle the cat2 levels randomly
  G1 = factor(rep(1:G1, each = N / G1)),
  G2 = factor(rep(1:G2, each = N / G2))
)
# Fit the hierarchical model with interaction
fit <- lmerTest::lmer(outcome ~ cat1 * cat2 + (1|G1) + (1|G2), data = df)
# Extract effects of interaction
eff <- effect("cat1:cat2", fit)
# Convert effect object to dataframe for plotting
eff_df <- as.data.frame(eff)
# No renaming needed: as.data.frame(eff) already has the columns
# cat1, cat2, fit, se, lower, upper
# Visualization using ggplot2
ggplot(eff_df, aes(x = cat1, y = fit, color = cat2, group = cat2)) +
  geom_point(aes(shape = cat2), size = 3,
             position = position_dodge(width = 0.3)) +
  geom_line(aes(linetype = cat2), position = position_dodge(width = 0.3)) +
  geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.2,
                position = position_dodge(width = 0.3)) +
  labs(y = "Predicted Outcome", title = "Interaction Effect of cat1 and cat2") +
  theme_minimal()
which generates the following plot:

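For the significance stars you asked about, one approach is to run the pairwise comparisons of cat2 within each level of cat1 with the emmeans package and translate the p-values into star labels. This is a sketch (assuming emmeans is installed; I have not worked out bracket geometry), repeated here self-contained with the same toy data:

```r
library(lmerTest)
library(emmeans)

# Same toy data as above
set.seed(123)
N <- 1000
df <- data.frame(
  outcome = rnorm(N),
  cat1 = factor(rep(1:2, each = N / 2)),
  cat2 = factor(sample(c("Control", "T1", "T2", "T3"), N, replace = TRUE)),
  G1 = factor(rep(1:20, each = N / 20)),
  G2 = factor(rep(1:10, each = N / 10))
)
fit <- lmer(outcome ~ cat1 * cat2 + (1 | G1) + (1 | G2), data = df)

# Tukey-adjusted pairwise comparisons of cat2 within each level of cat1:
# choose(4, 2) = 6 pairs per cat1 level, 12 rows in total
cmp <- as.data.frame(pairs(emmeans(fit, ~ cat2 | cat1), adjust = "tukey"))

# Translate p-values into star labels
cmp$stars <- as.character(cut(cmp$p.value,
                              breaks = c(0, 0.001, 0.01, 0.05, 1),
                              labels = c("***", "**", "*", "ns"),
                              include.lowest = TRUE))
head(cmp)
```

cmp now has one row per comparison (e.g. Control - T1) per cat1 level. You can join these labels onto the plot data and place them with geom_text(), as you suggested, or feed xmin/xmax/annotations into ggsignif::geom_signif(..., manual = TRUE) if you want brackets; note that with dodged positions the bracket x-coordinates are numeric offsets around each cat1 position, which you have to compute yourself.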