Variable Selection with mgcv

Tags: r, mgcv, gam

Is there a way to automate variable selection for a GAM in R, similar to step? I've read the documentation for step.gam and gam.selection, but I've yet to see an answer with code that works. Additionally, I've tried method = "REML" and select = TRUE, but neither removes insignificant variables from the model.

I've theorized that I could create a step model and then use those variables to create the GAM, but that does not seem computationally efficient.

Example:

library(mgcv)

set.seed(0)
dat <- data.frame(rsp = rnorm(100, 0, 1), 
                  pred1 = rnorm(100, 10, 1), 
                  pred2 = rnorm(100, 0, 1), 
                  pred3 = rnorm(100, 0, 1), 
                  pred4 = rnorm(100, 0, 1))

model <- gam(rsp ~ s(pred1) + s(pred2) + s(pred3) + s(pred4),
             data = dat, method = "REML", select = TRUE)

summary(model)

#Family: gaussian 
#Link function: identity 

#Formula:
#rsp ~ s(pred1) + s(pred2) + s(pred3) + s(pred4)

#Parametric coefficients:
#            Estimate Std. Error t value Pr(>|t|)
#(Intercept)  0.02267    0.08426   0.269    0.788

#Approximate significance of smooth terms:
#            edf Ref.df     F p-value  
#s(pred1) 0.8770      9 0.212  0.1174  
#s(pred2) 1.8613      9 0.638  0.0374 *
#s(pred3) 0.5439      9 0.133  0.1406  
#s(pred4) 0.4504      9 0.091  0.1775  
#---
#Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

#R-sq.(adj) =  0.0887   Deviance explained = 12.3%
#-REML = 129.06  Scale est. = 0.70996   n = 100

asked Jul 25 '16 14:07

IJH

1 Answer

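Marra and Wood (2011, Computational Statistics and Data Analysis 55: 2372-2387) compare several approaches to variable selection in GAMs. They concluded that adding an extra penalty to each smooth during smoothness selection, so that a term can be penalized all the way to zero, gave the best results. In mgcv::gam() this is switched on with select = TRUE. Note that this shrinks terms rather than dropping them from the formula: a smooth whose edf in summary() is close to 0 has effectively been removed from the model, even though it is still listed. Shrinkage smoothers (bs = "ts" or bs = "cs") are a related alternative that builds the shrinkage into the basis itself. Variations on the same call: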

# Shrinkage smoothers: the penalty is modified so the whole term can shrink to zero
model <- gam(rsp ~ s(pred1, bs = "ts") + s(pred2, bs = "ts") + s(pred3, bs = "ts") + s(pred4, bs = "ts"),
             data = dat, method = "REML")
model <- gam(rsp ~ s(pred1, bs = "cs") + s(pred2, bs = "cs") + s(pred3, bs = "cs") + s(pred4, bs = "cs"),
             data = dat, method = "REML")
# Ordinary bases need select = TRUE to get the extra penalty
model <- gam(rsp ~ s(pred1, bs = "cr") + s(pred2, bs = "cr") + s(pred3, bs = "cr") + s(pred4, bs = "cr"),
             data = dat, method = "REML", select = TRUE)
model <- gam(rsp ~ s(pred1, bs = "tp") + s(pred2, bs = "tp") + s(pred3, bs = "tp") + s(pred4, bs = "tp"),
             data = dat, method = "REML", select = TRUE)
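If you do want terms dropped from the formula, one rough option is to read the per-term edf from a select = TRUE fit and refit with only the smooths that were not shrunk away. The sketch below reuses the simulated data from the question. The cutoff edf_cutoff and the object names full and reduced are illustrative assumptions, not anything mgcv prescribes, and p-values from the refitted model ignore the selection step.

```r
library(mgcv)

# Same simulated data as in the question: pure noise, no real effects
set.seed(0)
dat <- data.frame(rsp = rnorm(100, 0, 1),
                  pred1 = rnorm(100, 10, 1),
                  pred2 = rnorm(100, 0, 1),
                  pred3 = rnorm(100, 0, 1),
                  pred4 = rnorm(100, 0, 1))

# Full model with the extra selection penalty on every smooth
full <- gam(rsp ~ s(pred1) + s(pred2) + s(pred3) + s(pred4),
            data = dat, method = "REML", select = TRUE)

# Effective degrees of freedom per smooth, named "s(pred1)", "s(pred2)", ...
edf <- summary(full)$s.table[, "edf"]

# Hypothetical cutoff (an assumption): treat smooths below it as shrunk out
edf_cutoff <- 0.5
keep <- names(edf)[edf > edf_cutoff]

# Refit with the retained smooths only (intercept-only if none survive)
reduced_formula <- if (length(keep) > 0) {
  reformulate(keep, response = "rsp")
} else {
  rsp ~ 1
}
reduced <- gam(reduced_formula, data = dat, method = "REML", select = TRUE)

print(round(edf, 3))
print(formula(reduced))
```

The choice of cutoff is arbitrary, so it is worth checking how sensitive the retained set is to it before relying on the reduced model.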

answered Oct 15 '22 06:10

Hack-R
