
Using IN with sets of tuples in SQL (SQLite3)

I have the following table in a SQLite3 database:

CREATE TABLE overlap_results (
neighbors_of_annotation varchar(20),
other_annotation varchar(20),
set1_size INTEGER,
set2_size INTEGER,
jaccard REAL,
p_value REAL,
bh_corrected_p_value REAL,
PRIMARY KEY (neighbors_of_annotation, other_annotation)
);

I would like to perform the following query:

SELECT * FROM overlap_results WHERE 
(neighbors_of_annotation, other_annotation)
IN (('16070', '8150'), ('16070', '44697'));

That is, I have a couple of tuples of annotation IDs, and I'd like to fetch records for each of those tuples. The sqlite3 prompt gives me the following error:

SQL error: near ",": syntax error

How do I properly express this as a SQL statement?


EDIT: I realize I did not explain well what I am really after. Let me take another crack at this.

If a person gives me an arbitrary list of terms in neighbors_of_annotation that they're interested in, I can write a SQL statement like the following:

SELECT * FROM overlap_results WHERE 
neighbors_of_annotation
IN (TERM_1, TERM_2, ..., TERM_N);

But now suppose that person wants to give me pairs of terms of the form (TERM_1,1, TERM_1,2), (TERM_2,1, TERM_2,2), ..., (TERM_N,1, TERM_N,2), where TERM_i,1 is in neighbors_of_annotation and TERM_i,2 is in other_annotation. Does the SQL language provide an equally elegant way to formulate the query for pairs (tuples) of interest?

The simplest solution seems to be to create a new table just for these pairs, join that table with the table to be queried, and select only the rows where both the first terms and the second terms match. Creating tons of AND / OR statements looks scary and error-prone.
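
The temporary-table idea described above can be sketched roughly as follows, assuming Python's standard sqlite3 module. The names wanted_pairs, noa and oa are hypothetical, and the sample rows are made up purely so the example runs:

```python
import sqlite3

def fetch_by_pairs_join(conn, pairs):
    """Fetch overlap_results rows matching any (neighbors_of_annotation,
    other_annotation) pair by joining against a temporary table."""
    cur = conn.cursor()
    # wanted_pairs, noa and oa are hypothetical names chosen for this sketch.
    cur.execute(
        "CREATE TEMP TABLE wanted_pairs ("
        "noa varchar(20), oa varchar(20), PRIMARY KEY (noa, oa))"
    )
    # OR IGNORE drops duplicate pairs so the join cannot return a row twice.
    cur.executemany("INSERT OR IGNORE INTO wanted_pairs VALUES (?, ?)", pairs)
    rows = cur.execute(
        "SELECT r.* FROM overlap_results AS r "
        "JOIN wanted_pairs AS w "
        "ON r.neighbors_of_annotation = w.noa "
        "AND r.other_annotation = w.oa"
    ).fetchall()
    cur.execute("DROP TABLE wanted_pairs")
    return rows

# Demo on an in-memory database using the schema from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE overlap_results ("
    "neighbors_of_annotation varchar(20), other_annotation varchar(20), "
    "set1_size INTEGER, set2_size INTEGER, jaccard REAL, p_value REAL, "
    "bh_corrected_p_value REAL, "
    "PRIMARY KEY (neighbors_of_annotation, other_annotation))"
)
# Made-up sample values, only the two ID columns matter here.
conn.executemany(
    "INSERT INTO overlap_results VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("16070", "8150", 10, 20, 0.5, 0.01, 0.02),
        ("16070", "44697", 10, 30, 0.25, 0.04, 0.05),
        ("16070", "9987", 10, 40, 0.1, 0.2, 0.3),
        ("8150", "44697", 20, 30, 0.3, 0.5, 0.6),
    ],
)
conn.commit()

rows = fetch_by_pairs_join(conn, [("16070", "8150"), ("16070", "44697")])
for row in rows:
    print(row)
```

The join keeps the statement the same size no matter how many pairs are requested, which is the main attraction over a long chain of AND / OR conditions.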

gotgenes asked Apr 08 '10 01:04


2 Answers

I've never seen SQL like that. If it exists, I would suspect it's a non-standard extension. Try:

SELECT * FROM overlap_results
WHERE neighbors_of_annotation = '16070'
AND   other_annotation = '8150'
UNION ALL SELECT * FROM overlap_results
WHERE neighbors_of_annotation = '16070'
AND   other_annotation = '44697';

In other words, build the dynamic query from your tuples but as a series of unions instead, or as a series of ANDs within ORs:

SELECT * FROM overlap_results
WHERE (neighbors_of_annotation = '16070' AND other_annotation =  '8150')
OR    (neighbors_of_annotation = '16070' AND other_annotation = '44697');

So, instead of code (pseudo-code, tested only in my head so debugging is your responsibility) such as:

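# Builds the tuple-IN form that the sqlite3 prompt rejected in the question.
query  = "SELECT * FROM overlap_results"
query += " WHERE (neighbors_of_annotation, other_annotation) IN ("
sep = ""
for element in pairs:  # each element carries .noa and .oa strings
    query += sep + "('" + element.noa + "','" + element.oa + "')"
    sep = ","
query += ");"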

you would instead have something like:

# One parenthesized AND clause per pair, chained with OR.
query  = "SELECT * FROM overlap_results"
sep = " WHERE "
for element in pairs:  # each element carries .noa and .oa strings
    query += sep + "(neighbors_of_annotation = '" + element.noa + "'"
    query += " AND other_annotation = '" + element.oa + "')"
    sep = " OR "
query += ";"
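
For what it's worth, here is a runnable take on the same OR-of-ANDs idea, assuming Python's standard sqlite3 module. It binds the values through ? placeholders instead of pasting them into the SQL text, so quoting is left to the driver. The helper name build_pair_query is hypothetical, the sample rows are made up, and returning no rows for an empty list is a choice made for this sketch:

```python
import sqlite3

def build_pair_query(pairs):
    """Build the OR-of-ANDs query with ? placeholders and a flat parameter list."""
    clause = "(neighbors_of_annotation = ? AND other_annotation = ?)"
    query = "SELECT * FROM overlap_results"
    if pairs:
        query += " WHERE " + " OR ".join([clause] * len(pairs))
    else:
        query += " WHERE 0"  # no pairs requested, so match nothing
    # Flatten [(a1, b1), (a2, b2)] into [a1, b1, a2, b2] to line up with the ?s.
    params = [term for pair in pairs for term in pair]
    return query, params

# Demo on an in-memory database using the schema from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE overlap_results ("
    "neighbors_of_annotation varchar(20), other_annotation varchar(20), "
    "set1_size INTEGER, set2_size INTEGER, jaccard REAL, p_value REAL, "
    "bh_corrected_p_value REAL, "
    "PRIMARY KEY (neighbors_of_annotation, other_annotation))"
)
# Made-up sample values, only the two ID columns matter here.
conn.executemany(
    "INSERT INTO overlap_results VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("16070", "8150", 10, 20, 0.5, 0.01, 0.02),
        ("16070", "44697", 10, 30, 0.25, 0.04, 0.05),
        ("16070", "9987", 10, 40, 0.1, 0.2, 0.3),
    ],
)
conn.commit()

query, params = build_pair_query([("16070", "8150"), ("16070", "44697")])
rows = conn.execute(query, params).fetchall()
for row in rows:
    print(row)
```

SQLite limits how many parameters one statement may bind, so a very long list of pairs may need to be split into batches.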

paxdiablo answered Sep 26 '22 06:09


I'm not aware of any SQL dialects that support tuples inside IN clauses. I think you're stuck with:

SELECT * FROM overlap_results
WHERE (neighbors_of_annotation = '16070' AND other_annotation = '8150')
OR    (neighbors_of_annotation = '16070' AND other_annotation = '44697');

Of course, this particular query can be simplified to something like:

SELECT * FROM overlap_results
WHERE neighbors_of_annotation = '16070'
AND   (other_annotation = '8150' OR other_annotation = '44697');

Generally, SQL WHERE-clause predicates filter on a single column at a time.
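
As a hedged footnote: SQLite releases from 3.15.0 onward do accept a row value on the left of IN, provided the right-hand side is a subquery such as a VALUES list, so something close to the query in the question can be written there. A minimal sketch, assuming Python's standard sqlite3 module is linked against such a version. The function name fetch_by_pairs_row_values is hypothetical and the sample rows are made up:

```python
import sqlite3

def fetch_by_pairs_row_values(conn, pairs):
    """Match (neighbors_of_annotation, other_annotation) against a VALUES list."""
    if sqlite3.sqlite_version_info < (3, 15, 0):
        raise RuntimeError("row values need SQLite 3.15.0 or newer")
    if not pairs:
        return []  # an empty VALUES list is not valid SQL, so stop early
    values = ", ".join(["(?, ?)"] * len(pairs))
    query = (
        "SELECT * FROM overlap_results "
        "WHERE (neighbors_of_annotation, other_annotation) "
        "IN (VALUES " + values + ")"
    )
    # Flatten the pairs so each ? receives one term, in order.
    params = [term for pair in pairs for term in pair]
    return conn.execute(query, params).fetchall()

# Demo on an in-memory database using the schema from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE overlap_results ("
    "neighbors_of_annotation varchar(20), other_annotation varchar(20), "
    "set1_size INTEGER, set2_size INTEGER, jaccard REAL, p_value REAL, "
    "bh_corrected_p_value REAL, "
    "PRIMARY KEY (neighbors_of_annotation, other_annotation))"
)
# Made-up sample values, only the two ID columns matter here.
conn.executemany(
    "INSERT INTO overlap_results VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("16070", "8150", 10, 20, 0.5, 0.01, 0.02),
        ("16070", "44697", 10, 30, 0.25, 0.04, 0.05),
        ("16070", "9987", 10, 40, 0.1, 0.2, 0.3),
    ],
)
conn.commit()

rows = fetch_by_pairs_row_values(conn, [("16070", "8150"), ("16070", "44697")])
for row in rows:
    print(row)
```

On older SQLite builds this raises instead of running, and the AND / OR form above remains the portable fallback.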

ig0774 answered Sep 26 '22 06:09