I am researching how to read in data from a server directly to a data frame in R. In the past I have written SQL queries that were over 50 lines long (with all the selects and joins). Any advice on how to write long queries in R? Is there some way to write the query elsewhere in R, then paste it in to the "sqlQuery" part of the code?
Keep long SQL queries in .sql files and read them in using readLines + paste with collapse='\n'
my_query <- paste(readLines('your_query.sql'), collapse='\n')
results <- sqlQuery(con, my_query)
You can paste any SQL query into R as is and then simply replace the newlines + spaces with a single space. For instance:
## Connect ot DB
library(RODBC)
con <- odbcConnect('MY_DB')
## Paste the query as is (you can have how many spaces and lines you want)
query <-
"
SELECT [Serial Number]
,[Series]
,[Customer Name]
,[Zip_Code]
FROM [dbo].[some_db]
where [Is current] = 'Yes' and
[Serial Number] LIKE '5%' and
[Series] = '20'
order by [Serial Number]
"
## Simply replace the new lines + spaces with a space and you good to go
res <- sqlQuery(con, gsub("\\n\\s+", " ", query))
close(con)
Approach with separate .sql (most sql or nosql engines) files can be trouble if one prefer to edit code in one file.
As far as someone using RStudio (or other tool where code folding can be customized), simplifying can be done using parenthesis. I prefer using {...}
and fold the code.
For example:
query <- {'
SELECT
user_id,
format(date,"%Y-%m") month,
product_group,
product,
sum(amount_net) income,
count(*) number
FROM People
WHERE
date > "2015-01-01" and
country = "Canada"
GROUP BY 1,2,3;'})
Folding a query can be even done within function (folding long argument), or in other situations where our code extends to inconvenient sizes.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With