I have a time-series data set with a continuous outcome variable and two factor predictors (one with 6 levels and one with 2 levels).
I would like to model the non-linear interaction of the two factors in their effect on the outcome over time.
This is the model I have so far:
library(mgcv)
model <- bam(
  outcome ~
    factor_1 + factor_2 +
    s(time, k = 9) +
    s(time, by = factor_1, k = 9) +
    s(time, by = factor_2, k = 9),
  data = df
)
summary(model)
Family: gaussian
Link function: identity

Formula:
outcome ~ factor_1 + factor_2 + s(time, k = 9) + s(time, by = factor_1,
    k = 9) + s(time, by = factor_2, k = 9)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2612.72      23.03 113.465   <2e-16 ***
factor_1b      33.19      27.00   1.229     0.22
factor_2z    -488.52      27.00 -18.093   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
                    edf Ref.df      F  p-value
s(time)           2.564  3.184  6.408 0.000274 ***
s(time):factor_1b 1.000  1.001  0.295 0.587839
s(time):factor_2z 2.246  2.792 34.281  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) = 0.679   Deviance explained = 69.1%
fREML = 1359.6  Scale est. = 37580   n = 207
Now I would like to add a non-linear interaction of factor_1 and factor_2 with time for the effect on outcome, so that the smoothers in every combination can differ (for example, factor_2 has a stronger non-linear effect for some levels of factor_1). Something like s(time, factor_1, factor_2) or s(time, factor_1, by = factor_2) does not work.
Smoothing parameters can be chosen by minimising a generalized cross-validation (GCV) criterion, or with a mixed-model approach via restricted maximum likelihood (REML).
A GAM extends generalised linear models such as linear regression with one key difference: the linear predictor is a sum of smooth functions of the predictors, so the model can learn non-linear relationships.
Because each smooth is estimated from the data, a GAM can adapt to many kinds of relationships without the functional form being specified in advance.
mgcv is an R package for estimating penalized Generalized Linear models including Generalized Additive Models and Generalized Additive Mixed Models. mgcv includes an implementation of 'gam', based on penalized regression splines with automatic smoothness estimation.
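As a minimal sketch of the two smoothing-parameter criteria mentioned above, the same model can be fitted with gam() under GCV and under REML by changing the method argument. The simulated data frame sim and its columns are assumptions made purely for illustration and are not the data from the question.

```r
library(mgcv)

# Hypothetical simulated data, for illustration only.
set.seed(1)
sim <- data.frame(time = seq(0, 1, length.out = 200))
sim$outcome <- sin(2 * pi * sim$time) + rnorm(200, sd = 0.3)

# Same model, two smoothing-parameter selection criteria.
fit_gcv  <- gam(outcome ~ s(time, k = 9), data = sim, method = "GCV.Cp")
fit_reml <- gam(outcome ~ s(time, k = 9), data = sim, method = "REML")

# Compare the estimated smoothing parameters and effective degrees of freedom.
fit_gcv$sp
fit_reml$sp
sum(fit_gcv$edf)
sum(fit_reml$edf)
```

The two criteria often give similar fits; comparing the effective degrees of freedom is a quick way to see whether they disagree on how wiggly the smooth should be.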
Including an interaction of the two factors using interaction() seems to do the job.
library(mgcv)
# The following assumes the factors are ordered, with treatment contrasts.
model <- bam(
  outcome ~
    interaction(factor_1, factor_2) +
    s(time, k = 9) +
    s(time, by = interaction(factor_1, factor_2), k = 9),
  data = df
)
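One point worth checking: interaction() returns an unordered factor, and mgcv treats ordered and unordered by factors differently (for an ordered factor, no smooth is generated for the first level, so the remaining smooths act as differences from the reference smooth). The following is a hedged sketch of one way to make the reference-level setup explicit, by storing the interaction as an ordered factor with treatment contrasts before fitting. The simulated data frame sim and the column name f12 are hypothetical and only serve to make the example runnable.

```r
library(mgcv)

# Hypothetical simulated data, for illustration only.
set.seed(42)
n <- 600
sim <- data.frame(
  time     = runif(n),
  factor_1 = factor(sample(c("a", "b", "c"), n, replace = TRUE)),
  factor_2 = factor(sample(c("y", "z"), n, replace = TRUE))
)
# The time effect is twice as strong for the combination b/z.
sim$outcome <- sin(2 * pi * sim$time) *
  ifelse(sim$factor_1 == "b" & sim$factor_2 == "z", 2, 1) +
  rnorm(n, sd = 0.3)

# Store the interaction as an ordered factor with treatment contrasts,
# so that the first combination acts as the reference level.
sim$f12 <- as.ordered(interaction(sim$factor_1, sim$factor_2))
contrasts(sim$f12) <- "contr.treatment"

fit <- bam(
  outcome ~ f12 + s(time, k = 9) + s(time, by = f12, k = 9),
  data = sim
)

# One reference smooth plus one difference smooth per non-reference combination.
summary(fit)$s.table
```

With 3 x 2 = 6 combinations, the smooth table should contain the reference smooth s(time) and five difference smooths, which makes it easy to see which combinations deviate from the reference curve.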