I've looked at the 'Pass R variable to RODBC's sqlQuery with multiple entries? ' already but can't seem to get it to work. I'm trying to do an sqlQuery()
from R on a SQL Server 2008 R2 db. I'm trying to get a sample from a large db based on row numbers. First I created a list of random numbers:
sampRowNum <- sample(seq(1,100000,1), 5000)
Then I try to use those numbers in a query using:
query1 <- sqlQuery(channel, paste("select *
FROM db where row_id in (", sampRowNum,")", sep=""))
I get just the results from the db where the row_id
is equal to the first number in sampRowNum
. Any suggestions?
To use SQL, open an R Notebook in the RStudio IDE under the File > New File menu. Start a new code chunk with {sql} , and specify your connection with the connection=con code chunk option. If you want to send the query output to an R dataframe, use output. var = "mydataframe" in the code chunk options.
The RSQLite package allows R to interface with SQLite databases. This command does not load the data into the R session (as the read_csv() function did). Instead, it merely instructs R to connect to the SQLite database contained in the portal_mammals. sqlite file.
You're not paste
ing your query together correctly.
If you run the paste
statement in isolation, you'll see that you get a vector of length 5000, so sqlQuery
is only executing the first one of those, corresponding to the first element in samRowNum
.
What you want to do is something more like this:
paste("select * FROM db where row_id in (",
paste(sampRowNum,collapse = ","),")", sep="")
Just as an added note (and since I've had to do stuff like this a lot...) constructing sql queries with an IN
clause with strings is a bit more of a nuisance, since you have to tack on all the single quotes:
vec <- letters[1:5]
paste("SELECT * FROM db WHERE col IN ('",
paste(vec,collapse = "','"),"')",sep = "")
[1] "SELECT * FROM db WHERE col IN ('a','b','c','d','e')"
If you do this a lot, you'll end up writing a little function that does that pasting of character vectors for you.
As always, this kind of SQL string manipulation is Not Good if you are dealing with user inputs (e.g. in a web app), due to SQL injection attacks. In my particular situation this isn't much of a concern, but in general people will prefer parametrized queries if you don't have much control over the input values.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With