
Using R to do a regression with multiple dependent and multiple independent variables

Tags:

r

I am trying to run regressions with multiple dependent variables and multiple independent variables. Basically, I have house prices at a county level for the whole US; this is my dependent variable (DV). I also have several other variables at a county level (GDP, construction employment); these are my independent variables (IVs). I would like to know if there is an efficient way to run all of these regressions at the same time. I am trying to get:

lm(DV1 ~ IV11 + IV21)
lm(DV2 ~ IV12 + IV22)

I would like to do this for each dependent variable and its associated independent variables.

EDIT: The OP added this information in response to my answer, now deleted, which misunderstood the question.

I don't think I explained this question very well; I apologize. Every dependent variable has 2 independent variables associated with it that are unique to it. So if I have 500 dependent variables, I have 500 unique first independent variables and 500 unique second independent variables.

OK, I will try once more; if I fail to explain myself again I may just give up (haha). I don't know what you mean by mtcars from R though [this is in reference to Metrics's answer], so let me try it this way. I'm going to have 3 vectors of data with roughly 500 rows in each one. I'm trying to build a regression out of each row of data. Let's say vector 1 is my dependent variable (the one I'm trying to predict), and vectors 2 and 3 make up my independent variables. So the first regression would consist of the row 1 value from each vector, the second would consist of the row 2 value from each one, and so on. Thank you all again.

user2355903 asked Aug 05 '13 20:08



1 Answer

I am assuming your data frame is called mydata.

mydata<-mtcars # mtcars is a dataset built into R

dep<-c("mpg~","cyl~","disp~") # list of unique dependent variables with ~ 
indep1<-c("hp","drat","wt")  # list of first unique independent variables 
indep2<-c("qsec","vs","am") # list of second unique independent variables 
myvar<-cbind(dep,indep1,indep2) # matrix of variables
myvar
     dep     indep1 indep2
[1,] "mpg~"  "hp"   "qsec"
[2,] "cyl~"  "drat" "vs"  
[3,] "disp~" "wt"   "am" 



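k <- list() # holds the fitted models; must exist before k[[i]] is assigned
for (i in seq_len(nrow(myvar))) {
  print(paste("This is", i, "regression", "with dependent var", gsub("~", "", myvar[i, 1])))
  # build e.g. "mpg~hp+qsec" from row i and fit it
  k[[i]] <- lm(as.formula(paste(myvar[i, 1], paste(myvar[i, 2:3], collapse = "+"))), mydata)
  print(k[[i]])
}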



 [1] "This is 1 regression with dependent var mpg"

Call:
lm(formula = as.formula(paste(myvar[i, 1], paste(myvar[i, 2:3], 
    collapse = "+"))), data = mydata)

Coefficients:
(Intercept)           hp         qsec  
   48.32371     -0.08459     -0.88658  

[1] "This is 2 regression with dependent var cyl"

Call:
lm(formula = as.formula(paste(myvar[i, 1], paste(myvar[i, 2:3], 
    collapse = "+"))), data = mydata)

Coefficients:
(Intercept)         drat           vs  
     12.265       -1.421       -2.209  

[1] "This is 3 regression with dependent var disp"

Call:
lm(formula = as.formula(paste(myvar[i, 1], paste(myvar[i, 2:3], 
    collapse = "+"))), data = mydata)

Coefficients:
(Intercept)           wt           am  
    -148.59       116.47        11.31  

Note: You can use the same process for a large number of variables.
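For a longer list of regressions, it may be easier to keep the variable names in a data frame and build each formula with reformulate() from the stats package rather than pasting a "~" onto the names. The following is only a sketch under that assumption; the names specs and fits are illustrative and not part of the answer above.

```r
# Sketch only: `specs` and `fits` are illustrative names, not from the answer.
# One row per regression: the response and its two predictors, as strings.
specs <- data.frame(
  dep    = c("mpg", "cyl", "disp"),
  indep1 = c("hp", "drat", "wt"),
  indep2 = c("qsec", "vs", "am"),
  stringsAsFactors = FALSE
)

fits <- vector("list", nrow(specs))  # preallocate one slot per regression
names(fits) <- specs$dep             # lets us look models up by response name

for (i in seq_len(nrow(specs))) {
  # reformulate() turns character names into the formula dep ~ indep1 + indep2
  f <- reformulate(c(specs$indep1[i], specs$indep2[i]), response = specs$dep[i])
  fits[[i]] <- lm(f, data = mtcars)
}

coef(fits[["mpg"]])  # coefficients of the first regression
```

With real data, specs would have one row per dependent variable (for example 500 rows), and mtcars would be replaced by the data frame that holds those columns.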

Alternative approach:

Motivated by Hadley's answer here, I use the function Map to solve the above problem:

dep<-list("mpg~","cyl~","disp~") # list of unique dependent variables with ~ 
indep1<-list("hp","drat","wt")  # list of first unique independent variables 
indep2<-list("qsec","vs","am") # list of second unique independent variables
Map(function(x,y,z) lm(as.formula(paste(x,paste(list(y,z),collapse="+"))),data=mtcars),dep,indep1,indep2)
[[1]]

Call:
lm(formula = as.formula(paste(x, paste(list(y, z), collapse = "+"))), 
    data = mtcars)

Coefficients:
(Intercept)           hp         qsec  
   48.32371     -0.08459     -0.88658  


[[2]]

Call:
lm(formula = as.formula(paste(x, paste(list(y, z), collapse = "+"))), 
    data = mtcars)

Coefficients:
(Intercept)         drat           vs  
     12.265       -1.421       -2.209  


[[3]]

Call:
lm(formula = as.formula(paste(x, paste(list(y, z), collapse = "+"))), 
    data = mtcars)

Coefficients:
(Intercept)           wt           am  
    -148.59       116.47        11.31  
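With hundreds of models, printing each fit is hard to read, so it may help to collect the estimates into one table. The following is a minimal sketch that assumes the same Map approach as above, but with plain character vectors (no trailing "~") so that the resulting list is named by the dependent variable; models, coef_table and r_squared are illustrative names.

```r
# Sketch only: `models`, `coef_table` and `r_squared` are illustrative names.
dep    <- c("mpg", "cyl", "disp")  # dependent variables
indep1 <- c("hp", "drat", "wt")    # first independent variable for each
indep2 <- c("qsec", "vs", "am")    # second independent variable for each

# Map() names its result after the first argument when that is a character vector
models <- Map(function(x, y, z) {
  lm(as.formula(paste(x, "~", paste(c(y, z), collapse = "+"))), data = mtcars)
}, dep, indep1, indep2)

# One row per coefficient: which regression, which term, and the estimate
coef_table <- do.call(rbind, lapply(names(models), function(nm) {
  est <- coef(models[[nm]])
  data.frame(response = nm, term = names(est), estimate = unname(est),
             stringsAsFactors = FALSE)
}))

# R-squared for each regression, as a named numeric vector
r_squared <- sapply(models, function(m) summary(m)$r.squared)

print(coef_table)
print(r_squared)
```

The table has three rows per regression here (intercept plus two slopes), so it can be filtered or sorted like any other data frame.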
Metrics answered Nov 02 '22 23:11
