
Looping through covariates in regression using R

I'm trying to run 96 regressions and save the results as 96 different objects. To complicate things, I want the subscript on one of the covariates in the model to change with each regression as well. I've almost solved the problem but have hit a wall. My code so far is:

for(i in 1:96){

  assign(paste("z.out", i,sep=""), lm(rMonExp_EGM~ TE_i + Month2+Month3+Month4+Month5+Month6+Month7+Month8+Month9+
  Month10+Month11+Month12+Yrs_minus_2004 + 
  as.factor(LGA),data=Pokies))

}

This works on the object creation side (e.g. I have z.out1 - z.out96) but I can't seem to get the subscript on the covariate to change as well.

I have 96 variables called TE_1, TE_2, ..., TE_96 in the dataset, so the subscript on TE_ (the "i") needs to change to match each object I create. That is, z.out1 should hold the results from this model:

z.out1 <- lm(rMonExp_EGM~ TE_1 + Month2+Month3+Month4+Month5+Month6+Month7+Month8+Month9+
  Month10+Month11+Month12+Yrs_minus_2004 + as.factor(LGA),data=Pokies)

And z.out96 should be:

z.out96 <- lm(rMonExp_EGM~ TE_96+ Month2+Month3+Month4+Month5+Month6+Month7+Month8+Month9+
  Month10+Month11+Month12+Yrs_minus_2004 + as.factor(LGA),data=Pokies)

Hopefully this makes sense. I'm grateful for any tips/advice.

asked Nov 09 '12 04:11 by kpeyton



1 Answer

I would store the results in a list and avoid the for loop and the assign() calls.
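For context, the loop in the question never changes the covariate because a formula is stored unevaluated: TE_i is treated as a literal variable name, and the value of i is never substituted into it. A minimal sketch of the difference, using placeholder names (y, x) that are not from the question's data:

```r
i <- 1

# Written directly in a formula, TE_i is just a symbol; i is ignored
f <- y ~ TE_i + x
all.vars(f)

# Building the term label as a string does pick up the current value of i
g <- reformulate(c(paste0("TE_", i), "x"), response = "y")
all.vars(g)
```

The first call reports TE_i as a variable, while the second reports TE_1, which is why the formulas need to be constructed from strings.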

You can use a combination of reformulate and update to create the formulas:

orig_formula <- rMonExp_EGM ~ Month2+Month3+Month4+Month5+Month6+Month7+Month8+Month9+
  Month10+Month11+Month12+Yrs_minus_2004 + as.factor(LGA)


te_variables <- paste0('TE_', 1:96)
# Or if you don't have a current version of R
# te_variables <- paste('TE', 1:96, sep = '_')

# reformulate() builds `~ TE_i + .`; update() then replaces '.' with the
# original right-hand side and keeps the original response
new_formula <- lapply(te_variables, function(x, orig = orig_formula) {
    new <- reformulate(c(x, '.'))
    update(orig, new)})
## it works!
new_formula[[1]]
## rMonExp_EGM ~ TE_1 + Month2 + Month3 + Month4 + Month5 + Month6 +
##   Month7 + Month8 + Month9 + Month10 + Month11 + Month12 +
##   Yrs_minus_2004 + as.factor(LGA)
new_formula[[2]]
## rMonExp_EGM ~ TE_2 + Month2 + Month3 + Month4 + Month5 + Month6 +
##   Month7 + Month8 + Month9 + Month10 + Month11 + Month12 +
##   Yrs_minus_2004 + as.factor(LGA)


# the data frame in the question is called Pokies (R is case-sensitive)
models <- lapply(new_formula, lm, data = Pokies)

There should now be 96 elements in the list models.
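If you want to check the approach before running it on the real data, here is a minimal sketch on simulated data. The data frame toy and its columns (y, x, grp, and three TE_ variables) are made up for illustration and stand in for Pokies and its variables.

```r
set.seed(1)
n <- 60

# Simulated stand-in for the real data: a response, one control,
# a grouping factor, and three TE_ covariates
toy <- data.frame(
  y    = rnorm(n),
  x    = rnorm(n),
  grp  = rep(c("a", "b", "c"), length.out = n),
  TE_1 = rnorm(n),
  TE_2 = rnorm(n),
  TE_3 = rnorm(n)
)

base_formula <- y ~ x + as.factor(grp)
te_vars <- paste0("TE_", 1:3)

# One formula per TE_ variable, then one fitted model per formula
toy_formulas <- lapply(te_vars, function(v) {
  update(base_formula, reformulate(c(v, ".")))
})
toy_models <- lapply(toy_formulas, lm, data = toy)

length(toy_models)            # one model per TE_ variable
names(coef(toy_models[[2]]))  # includes TE_2 but not TE_1 or TE_3
```

Each element of toy_models is an ordinary lm object, so anything you would do with a single fitted model also works on toy_models[[k]].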

You can name them to reflect your originally planned names

names(models) <- paste0('z.out', 1:96)
# or if you don't have a current version of R
# names(models) <-paste('z.out', 1:96 ,sep = '' )  

and then access a single model by

 models$z.out5

etc

or create summaries of all of the models

 summaries <- lapply(models, summary)

etc....

 # just the coefficients
 coefficients <- lapply(models, coef)

 # the table with coefficient estimates and standard errors
 # (summaries is a list, so use lapply rather than apply)

 coef_tables <- lapply(summaries, '[[', 'coefficients')
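Since only the TE_ term differs between models, it may be convenient to collect just that row from each coefficient table into one matrix. A rough sketch on simulated data follows; toy, fits, te_rows and te_table are hypothetical names, not objects from the question.

```r
set.seed(42)
n <- 50

# Simulated stand-in data with three TE_ covariates
toy <- data.frame(y = rnorm(n), x = rnorm(n),
                  TE_1 = rnorm(n), TE_2 = rnorm(n), TE_3 = rnorm(n))

te_vars <- paste0("TE_", 1:3)
fits <- lapply(te_vars, function(v) {
  lm(reformulate(c(v, "x"), response = "y"), data = toy)
})
names(fits) <- paste0("z.out", 1:3)

# For each model, pull the coefficient-table row of its own TE_ term
te_rows <- Map(function(fit, v) coef(summary(fit))[v, ], fits, te_vars)

# Stack the rows: one line per model, with estimate, standard error,
# t value and p-value as columns
te_table <- do.call(rbind, te_rows)
te_table
```

With the real data, the same pattern would apply to models and te_variables, assuming every TE_ term is actually estimated in its model (a term dropped for collinearity would have no row to extract).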
answered Oct 15 '22 10:10 by mnel