I have a vector of Y values and a matrix of X values that I want to perform a multiple regression on (i.e. Y = X[column 1] + X[column 2] + ... + X[column N]).
The problem is that the number of columns in my matrix (N) is not known in advance. I know that in R, to perform a linear regression, you have to specify the equation:
fit = lm(Y~X[,1]+X[,2]+X[,3])
But how do I do this if I don't know how many columns are in my X matrix?
Thanks!
There are three ways, in increasing order of flexibility.
Method 1
Pass the matrix directly on the right-hand side of the formula; lm() fits one coefficient per column of X:
fit <- lm( Y ~ X )
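A minimal sketch with simulated data, assuming X is a plain numeric matrix: lm() accepts the matrix itself on the right-hand side and fits one coefficient per column, so the number of columns never appears in the formula. The names n and p and the simulated values are illustrative and not from the question.

```r
set.seed(1)                        # reproducible simulated data
n <- 50                            # number of observations (illustrative)
p <- 4                             # number of columns; never written into the formula
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
Y <- rnorm(n)

fit <- lm(Y ~ X)                   # intercept plus one coefficient per column of X
n_coef <- length(coef(fit))        # equals p + 1
coef_names <- names(coef(fit))     # "(Intercept)" "X1" "X2" "X3" "X4"
```

Changing p and rerunning the block leaves the lm() call untouched, which is the point of this approach.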
Method 2
Put all your data in one data.frame instead of keeping Y and X as separate objects:
dat <- cbind(data.frame(Y=Y),as.data.frame(X))
Then run your regression using the formula notation, where . stands for every column of dat other than Y:
fit <- lm( Y~. , data=dat )
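As a rough end-to-end sketch of this method, the block below builds Y and X from simulated values (an assumption made only for illustration) and shows what the combined data frame looks like. When X has no column names, as.data.frame() assigns V1, V2, and so on.

```r
set.seed(1)                                   # reproducible simulated data
X <- matrix(rnorm(50 * 3), ncol = 3)          # illustrative 50 x 3 predictor matrix
Y <- rnorm(50)                                # illustrative response vector

# One data frame holding the response and every predictor column
dat <- cbind(data.frame(Y = Y), as.data.frame(X))
dat_names <- names(dat)                       # "Y" "V1" "V2" "V3"

# The dot expands to all columns of dat except the response
fit <- lm(Y ~ ., data = dat)
fit_terms <- attr(terms(fit), "term.labels")  # "V1" "V2" "V3"
```

Having everything in one data frame also makes predict() with newdata straightforward, since the new data only needs columns with the same names.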
Method 3
Another way is to build the formula yourself:
model1.form.text <- paste("Y ~",paste(xvars,collapse=" + "),collapse=" ")
model1.form <- as.formula( model1.form.text )
model1 <- lm( model1.form, data=dat )
In this example, xvars is a character vector containing the names of the columns of dat that you want to use as predictors.
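A small sketch of how xvars might be produced and used, under the assumption that the predictors are simply all columns of dat except Y. The data frame and its column names x1, x2, x3 are made up for illustration. The sketch also shows reformulate() from the stats package, which builds the same formula without pasting strings by hand.

```r
set.seed(1)                                   # reproducible simulated data
dat <- data.frame(Y  = rnorm(30),             # illustrative data frame
                  x1 = rnorm(30),
                  x2 = rnorm(30),
                  x3 = rnorm(30))

# Predictor names: every column except the response (an assumption of this sketch)
xvars <- setdiff(names(dat), "Y")

# Build the formula from text, as in the method above
model1.form <- as.formula(paste("Y ~", paste(xvars, collapse = " + ")))

# Alternative: reformulate() assembles the same formula from the name vector
alt.form <- reformulate(xvars, response = "Y")

model1 <- lm(model1.form, data = dat)
same_formula <- identical(deparse(model1.form), deparse(alt.form))
```

Because xvars is an ordinary character vector, it can be filtered first (for example, xvars[xvars != "x2"]) to fit a model on a subset of the columns.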