I am using a the RSQLite
library in R in to manage a data set that is too large for RAM. For each regression I query the database to retrieve a fiscal year at a time. Now I have the fiscal year hard-coded:
data.annual <- dbGetQuery(db, "SELECT * FROM annual WHERE fyear==2008")
I would like to make the fiscal year (2008 above) to make changes a bit easier (and fool-proof). Is there a way that I can pass a variable into SQL query string? I would love to use:
fiscal.year <- 2008
data.annual <- dbGetQuery(db, "SELECT * FROM annual WHERE fyear==fiscal.year")
SQLite will only see the string passed down for the query, so what you do is something like
sqlcmd <- paste("SELECT * FROM annual WHERE fiscal=", fiscal.year, sep="")
data.annual <- dbGetQuery(db, sqlcmd)
The nice thing is that you can use this the usual way to unwind loops. Forgetting for a second that you have ram restrictions, conceptually you can do
years <- seq(2000,2010)
data <- lapply(years, function(y) {
dbGetQuery(db, paste("SELECT * FROM annual WHERE fiscal=", y, sep="")
}
and now data is a list containing all your yearly data sets. Or you could keep the data, run your regression and only store the summary object.
Dirk's answer is spot on. One little thing I try to do is change the formatting for easy testing. Seems I have to cut and paste the SQL text into an SQL editor many times. So I format like this:
sqlcmd <- paste("
SELECT *
FROM annual
WHERE fiscal=
", fiscal.year, sep="")
data.annual <- dbGetQuery(db, sqlcmd)
This just makes it easier to cut and paste the SQL bits in/out for testing in your DB query environment. No big deal with a short query, but can get cumbersome with a longer SQL string.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With