Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R: GAM with fit on subset of data

Tags:

r

mgcv

gam

I fit a Generalized Additive Model using gam from the mgcv package. I have a data table containing my dependent variable Y, an independent variable X, other independent variables Oth and a two-level factor Fac. I would like to fit the following model

Y ~ s(X) + Oth

BUT with the additional constraint that the s(X) term is fit only on one of the two levels of the factor, say Fac==1. The other terms Oth should be fit with the whole data.

I tried exploring s(X,by=Fac) but this biases the fit for Oth. In other words, I would like to express the belief that X relates to Y only if Fac==1, otherwise it does not make sense to model X.

like image 953
yannick Avatar asked Dec 07 '15 14:12

yannick


2 Answers

Cheap trick: use an auxiliary variable that is X if Fac == 1 and 0 elsewhere.

library("mgcv")
library("ggplot2")


# simulate data

N <- 1e3

dat <- data.frame(covariate = runif(N),
                  predictor = runif(N),
                  group = factor(sample(0:1, N, TRUE)))

dat$outcome <- rnorm(N,
                     1 * dat$covariate +
                     ifelse(dat$group == 1,
                            .5 * dat$predictor +
                            1.5 * sin(dat$predictor * pi),
                            0), .1)

# some plots

ggplot(dat, aes(x = predictor, y = outcome,
                col = group, group = group)) +
    geom_point()

ggplot(dat, aes(x = covariate, y = outcome,
                col = group, group = group)) +
    geom_point()

# create auxiliary variable

dat$aux <- ifelse(dat$group == 1,
                  dat$predictor,
                  0)

# fit the data

fit1 <- gam(outcome ~ covariate + s(predictor, by = group),
            data = dat)

fit2 <- gam(outcome ~ covariate + s(aux, by = group),
            data = dat)

# compare fits

summary(fit1)

summary(fit2)
like image 102
qenvio Avatar answered Oct 21 '22 07:10

qenvio


If I understand it right, you're thinking about some model with interaction like this:

Y ~ 0th + (Fac==1)*s(X)  

If you want to "express the belief that X relates to Y only if Fac==1" don't treat Fac as a factor, but as a numeric variable. In this case you will get numeric interaction and only one set of coefficients (when it's a factor there where two). This type of model is a varying coefficient model.

# some data
data <- data.frame(th = runif(100),
              X = runif(100),
              Y = runif(100),
              Fac = sample(0:1, 100, TRUE))
data$Fac<-as.numeric(as.character(data$Fac)) #change to numeric
# then run model
gam(Y~s(X, by=Fac)+th,data=data)

See the documentation for by option in the documentation ?s

like image 31
Maju116 Avatar answered Oct 21 '22 07:10

Maju116