Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R RODBC putting list of numbers into an IN() statement

Tags:

sql

r

rodbc

I've looked at the 'Pass R variable to RODBC's sqlQuery with multiple entries? ' already but can't seem to get it to work. I'm trying to do an sqlQuery() from R on a SQL Server 2008 R2 db. I'm trying to get a sample from a large db based on row numbers. First I created a list of random numbers:

sampRowNum <- sample(seq(1,100000,1), 5000)

Then I try to use those numbers in a query using:

query1 <- sqlQuery(channel, paste("select *      
  FROM db where row_id in (", sampRowNum,")", sep=""))

I get just the results from the db where the row_id is equal to the first number in sampRowNum. Any suggestions?

like image 895
screechOwl Avatar asked Dec 26 '11 21:12

screechOwl


People also ask

How do I pull SQL data into R?

To use SQL, open an R Notebook in the RStudio IDE under the File > New File menu. Start a new code chunk with {sql} , and specify your connection with the connection=con code chunk option. If you want to send the query output to an R dataframe, use output. var = "mydataframe" in the code chunk options.

Which command is used to connect R to database?

The RSQLite package allows R to interface with SQLite databases. This command does not load the data into the R session (as the read_csv() function did). Instead, it merely instructs R to connect to the SQLite database contained in the portal_mammals. sqlite file.


1 Answers

You're not pasteing your query together correctly.

If you run the paste statement in isolation, you'll see that you get a vector of length 5000, so sqlQuery is only executing the first one of those, corresponding to the first element in samRowNum.

What you want to do is something more like this:

paste("select * FROM db where row_id in (", 
    paste(sampRowNum,collapse = ","),")", sep="")

Just as an added note (and since I've had to do stuff like this a lot...) constructing sql queries with an IN clause with strings is a bit more of a nuisance, since you have to tack on all the single quotes:

vec <- letters[1:5]

paste("SELECT * FROM db WHERE col IN ('",
    paste(vec,collapse = "','"),"')",sep = "")

[1] "SELECT * FROM db WHERE col IN ('a','b','c','d','e')"

If you do this a lot, you'll end up writing a little function that does that pasting of character vectors for you.

As always, this kind of SQL string manipulation is Not Good if you are dealing with user inputs (e.g. in a web app), due to SQL injection attacks. In my particular situation this isn't much of a concern, but in general people will prefer parametrized queries if you don't have much control over the input values.

like image 104
joran Avatar answered Sep 22 '22 16:09

joran