Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL query multiple ranges without using multiple OR clauses (nesting LIKE, BETWEEN)

I commonly have to query an extremely large table for multiple variables with many conditions per variable. Often times, a variable will need to be queried for multiple ranges. For example, I could need all records of VAR_1 where VAR_1 is between 200-300, 350-400, 450-500.

Normally I would write this as follows, but have been told that using IN() instead of the multiple ORs would be much more efficient.

SELECT * FROM table
WHERE VAR_1 BETWEEN '200' AND '300' OR
      VAR_1 BETWEEN '350' AND '400' OR
      VAR_1 BETWEEN '450' AND '500'

Is there any way to condense this information and get rid of the ORs by nesting LIKE or BETWEEN clauses within an IN()? Something along the lines of:

WHERE VAR_1 IN (BETWEEN '200' AND '300', BETWEEN '350' AND '400', BETWEEN '450' AND '500')

or

WHERE VAR_1 IN ('[200-300]','[350-400]','[450-500]')

I have tried things like these, but the syntax is clearly incorrect. Any ideas or directions you can point me in would be great, still very new to SQL.

like image 908
Tucker Avatar asked May 31 '12 13:05

Tucker


1 Answers

Some comments deny that an IN clause can be more efficient than using OR - which i believe is incorrect for the general case.

It simply depends on the query optimizer of the database.

If the query optimizer is clever enough, it will transforms the OR and IN clause to the same execution plan (if they are semantically equal). In this case, the further plan processing will have the same costs for both plans, one small difference might be that the costs of the transformation might incur a small difference.

On the other hand, a query optimizer might not notice the strong correlation of the OR predicates and evaluate each range predicates independently (each one being a separate operator) in an execution plan, whereas an IN clause might be handled by a lookup table of all values in the in clause (only one operator) resulting in a noticeable runtime difference.

So the overall answer is it heavily depends on your DBMS, i would suggest that you try out both versions in a small benchmark setting on your system.

For SQL-92 there is no special syntax for something like

WHERE VAR_1 IN ('[200-300]','[350-400]','[450-500]')

so indeed you have only OR's or IN lists.

Please notice that the more general notation for

WHERE VAR_1 IN ( '200', '201' )

is

WHERE VAR_1 = ANY ('200','201' )

( the operator "=" can be replaced by any other comparison operator, e.g. "<=" )

like image 91
eci Avatar answered Sep 28 '22 08:09

eci