In ffbase (http://cran.r-project.org/web/packages/ffbase/ffbase.pdf) there is the bigglm function:
bigglm.ffdf(formula, data, family = gaussian(), ...,
where formula is something like Y ~ X, assuming Y and X are column names of the ffdf object passed as data.
What if I have 200 columns in data that I want to put on the RHS of the equation? Clearly I can't type Y~X1+X2+....+X200.
How do I run Y~X1+X2+....+X200 without typing out all 200 variables on the RHS?
The . symbol is the usual shorthand for this, though I'm not sure whether it works with ffbase. For example:
m <- lm(y ~ ., df)
will model y using all other columns in df as predictors.
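To see the dot shorthand in action with plain lm, here is a minimal sketch. It uses the built-in mtcars data set as a stand-in (an assumption for illustration, not the asker's data) and does not touch ffbase at all.

```r
# mtcars ships with base R; mpg is the response, every other column is a predictor.
m <- lm(mpg ~ ., data = mtcars)

# One coefficient per remaining column, plus the intercept.
coef_names <- names(coef(m))
print(coef_names)
```

The response itself is left out of the right-hand side, so the number of coefficients equals the number of columns in the data frame (the other columns plus the intercept).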
As described by Chris, this appears to be a bug in biglm, and can be worked around by using:
m <- bigglm(terms(y ~ ., data=df), data=df)
But this should be reported as a bug to the author of biglm.
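The workaround relies on terms() expanding the dot before the formula reaches the fitting function. A small sketch of that expansion, using base R only and the built-in mtcars data as an assumed stand-in for df:

```r
# terms() with a data argument replaces the dot by the actual column names.
tt <- terms(mpg ~ ., data = mtcars)

# The expanded right-hand side, one label per predictor column.
labels <- attr(tt, "term.labels")
print(labels)

# The same information as an ordinary formula with no dot left in it.
expanded <- formula(tt)
print(expanded)
```

Because the object handed over already lists every variable explicitly, the receiving function no longer has to resolve the dot itself.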
If Sam's answer doesn't work, you can build up a character string representing the formula and then convert it to a formula:
formula <- as.formula(paste('Y',
  paste(paste('X', 1:200, sep = ''), collapse = ' + '), sep = ' ~ '))
The inner paste creates X1 to X200. The next paste collapses the resulting vector into a single string with the elements of the first paste put together with +'s. The last paste adds on the Y ~. Finally, I change it from a string to a formula.
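When the predictors are not conveniently named X1 to X200, the same idea can be driven by the column names. The sketch below uses base R's reformulate() on an ordinary data.frame; the simulated df is a hypothetical stand-in, and whether names() behaves the same way on an ffdf object is an assumption I have not checked here.

```r
# Hypothetical stand-in data: a response Y plus 200 predictors X1..X200.
set.seed(1)
df <- as.data.frame(matrix(rnorm(300 * 201), nrow = 300))
names(df) <- c("Y", paste0("X", 1:200))

# Every column except the response goes on the right-hand side.
predictors <- setdiff(names(df), "Y")

# reformulate() builds Y ~ X1 + X2 + ... + X200 without manual string pasting.
f <- reformulate(predictors, response = "Y")

# Fit with plain lm here; the formula object could be passed on to another fitter.
m <- lm(f, data = df)
```

This avoids hard-coding the variable count, so the formula follows the data if columns are added or removed.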