
How do I exclude specific variables from a glm in R?

Tags: r, statistics, glm

I have 50 variables. This is how I use them all in my glm.

var = glm(Stuff ~ ., data=mydata, family=binomial) 

But I want to exclude 2 of them. How do I exclude two specific variables? I was hoping there would be something like this:

var = glm(Stuff ~ . # notthisstuff, data=mydata, family=binomial) 

Thoughts?

asked Mar 22 '14 at 16:03 by user3399551




1 Answer

In addition to using - in the formula, as suggested in the comments,

glm(Stuff ~ . - var1 - var2, data= mydata, family=binomial) 

you can also subset the data frame passed in

glm(Stuff ~ ., data=mydata[ , !(names(mydata) %in% c('var1','var2'))], family=binomial) 

or

glm(Stuff ~ ., data=subset(mydata, select=c( -var1, -var2 ) ), family=binomial ) 

(Be careful with that last one: subset() relies on non-standard evaluation, so it sometimes does not work well inside other functions.)
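If the model is fitted inside your own function, dropping columns by name with ordinary indexing avoids the subset() caveat. Here is a minimal sketch; the helper fit_without() and the simulated data frame are hypothetical and only there for illustration:

```r
# Hypothetical helper: fit a binomial glm on every column except those in `drop`.
fit_without <- function(data, response, drop) {
  keep <- setdiff(names(data), drop)              # select columns by name
  glm(as.formula(paste(response, "~ .")),         # e.g. Stuff ~ .
      data = data[, keep, drop = FALSE],
      family = binomial)
}

# Simulated stand-in for mydata (assumption, not the asker's data)
set.seed(1)
mydata <- data.frame(Stuff = rbinom(100, 1, 0.5),
                     var1 = rnorm(100), var2 = rnorm(100), var3 = rnorm(100))

fit <- fit_without(mydata, "Stuff", c("var1", "var2"))
names(coef(fit))   # only the intercept and var3 remain
```

Because the column names are passed as character strings, nothing depends on how expressions are evaluated in the calling environment.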

You could also use the paste function to create a string representing the formula with the terms of interest (subsetting to the group of predictors that you want), then use as.formula to convert it to a formula.
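As a rough sketch of the paste/as.formula approach, assuming a simulated data frame in place of mydata and the variable names var1 and var2 from the examples above:

```r
# Simulated stand-in for mydata (assumption, not the asker's data)
set.seed(42)
mydata <- data.frame(Stuff = rbinom(100, 1, 0.5),
                     var1 = rnorm(100), var2 = rnorm(100),
                     var3 = rnorm(100), var4 = rnorm(100))

# Keep every column except the response and the two excluded variables
predictors <- setdiff(names(mydata), c("Stuff", "var1", "var2"))

# Build "Stuff ~ var3 + var4" as a string, then convert it to a formula
f <- as.formula(paste("Stuff ~", paste(predictors, collapse = " + ")))

fit <- glm(f, data = mydata, family = binomial)
```

Base R's reformulate(predictors, response = "Stuff") should build the same formula without the manual paste step.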

answered Sep 21 '22 at 10:09 by Greg Snow