
How to specify the non-linear interaction of two factor variables in generalised additive models [R]

Tags: r, mgcv, gam

I have a time-series data set with a continuous outcome variable and two factor predictors (one with 6 levels and one with 2 levels).

I would like to model how the two factors interact in their non-linear effect over time on the outcome.

This is the model I have so far:

library(mgcv)

model <- bam(
    outcome ~
        factor_1 + factor_2 +
        s(time, k = 9) +
        s(time, by = factor_1, k = 9) +
        s(time, by = factor_2, k = 9),
    data = df
)

summary(model)
Family: gaussian 
Link function: identity 

Formula:
outcome ~ factor_1 + factor_2 + s(time, k = 9) + s(time, by = factor_1, 
    k = 9) + s(time, by = factor_2, k = 9)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2612.72      23.03 113.465   <2e-16 ***
factor_1b      33.19      27.00   1.229     0.22    
factor_2z    -488.52      27.00 -18.093   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
                    edf Ref.df      F  p-value    
s(time)           2.564  3.184  6.408 0.000274 ***
s(time):factor_1b 1.000  1.001  0.295 0.587839    
s(time):factor_2z 2.246  2.792 34.281  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.679   Deviance explained = 69.1%
fREML = 1359.6  Scale est. = 37580     n = 207

Now I would like to add a non-linear interaction of factor_1 and factor_2 with time, so that the smooth of time can differ for every combination of levels (for example, factor_2 may have a stronger non-linear effect at some levels of factor_1). Something like s(time, factor_1, factor_2) or s(time, factor_1, by = factor_2) does not work.

Stefano asked Dec 10 '17 14:12


1 Answer

Including the interaction of the two factors via interaction(), both as a parametric term and as the by variable of the smooth, seems to do the job.

library(mgcv)

# The following assumes the factors are ordered, with treatment contrasts.
model <- bam(
    outcome ~
        interaction(factor_1, factor_2) +
        s(time, k = 9) +
        s(time, by = interaction(factor_1, factor_2), k = 9),
    data = df
)
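
As a variation on the same idea, the combined factor can be built once as a column of the data frame and then passed to by. The sketch below is not the original poster's code: the simulated df, the column name f12, and the choice to leave out the global s(time) term are all assumptions made for illustration. The reason for leaving out s(time) here is that interaction() returns an unordered factor, and for an unordered by factor mgcv fits a separate smooth for every level, so a global smooth would overlap with them.

```r
library(mgcv)

# Simulated stand-in for the asker's data (all values here are made up).
set.seed(1)
n <- 600
df <- data.frame(
    time     = runif(n, 0, 10),
    factor_1 = factor(sample(letters[1:6], n, replace = TRUE)),
    factor_2 = factor(sample(c("y", "z"), n, replace = TRUE))
)
df$outcome <- sin(df$time) * (df$factor_2 == "z") +
    as.integer(df$factor_1) + rnorm(n, sd = 0.5)

# Build the combined factor once, as a column, so that fitting and
# prediction use the same levels. `f12` is a hypothetical column name.
df$f12 <- interaction(df$factor_1, df$factor_2, drop = TRUE)

# One smooth of time per factor combination. The parametric f12 term
# carries the level means, because the by-factor smooths are centred.
model <- bam(
    outcome ~ f12 + s(time, by = f12, k = 9),
    data = df
)

summary(model)
```

If a reference smooth plus difference smooths is preferred instead, the mgcv help page ?gam.models describes using an ordered factor as the by variable, in which case no smooth is generated for the first level and a global s(time) term can be kept.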
Stefano answered Oct 09 '22 08:10