Is there a performance difference between BETWEEN and IN with MySQL or in SQL in general?

I have a set of consecutive rows I want to fetch by their primary key, which is an auto-incrementing integer. Assuming that there are no holes, is there any performance difference between:

SELECT * FROM `theTable` WHERE `id` IN (n, ... nk);  

and:

SELECT * FROM `theTable` WHERE `id` BETWEEN n AND nk; 
pr1001 asked Jul 22 '10 11:07



2 Answers

BETWEEN should outperform IN in this case (but do measure and check execution plans, too!), especially as n grows and as long as statistics are still accurate. Let's assume:

  • m is the size of your table
  • n is the size of your range
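The advice to measure and check execution plans can be tried out with a small script. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in, and SQLite's planner is not MySQL's, so the plans it prints say nothing definitive about MySQL. The payload column, the table size, and the bounds n = 100 and nk = 110 are all made up for the example.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE theTable (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO theTable (id, payload) VALUES (?, ?)",
    ((i, f"row {i}") for i in range(1, 10_001)),
)

n, nk = 100, 110                      # hypothetical range bounds
ids = list(range(n, nk + 1))          # the same range spelled out for IN
placeholders = ", ".join("?" for _ in ids)

between_sql = "SELECT * FROM theTable WHERE id BETWEEN ? AND ?"
in_sql = f"SELECT * FROM theTable WHERE id IN ({placeholders})"


def plan(sql, params):
    """Return the human-readable detail column of SQLite's query plan."""
    cursor = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return [row[-1] for row in cursor]


print("BETWEEN plan:", plan(between_sql, (n, nk)))
print("IN plan:     ", plan(in_sql, ids))

# Both queries should return the same rows when the ids have no holes.
rows_between = sorted(conn.execute(between_sql, (n, nk)).fetchall())
rows_in = sorted(conn.execute(in_sql, ids).fetchall())
print(rows_between == rows_in)
```

On MySQL itself, the equivalent check is to prefix each query with EXPLAIN and compare the reported access type and row estimates.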

Index can be used (n is tiny compared to m)

  • In theory, BETWEEN can be implemented with a single "range scan" (Oracle speak) on the primary key index, which then traverses at most n index leaf nodes. The complexity is O(n + log m).

  • IN is usually implemented as a series (loop) of n "range scans" on the primary key index. With m being the size of the table, the complexity is O(n * log m), which is always worse (though negligibly so for very small tables m or very small ranges n).
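The two index strategies above can be sketched by counting key comparisons on a sorted Python list, which here plays the role of the primary key index. This is a toy model under stated assumptions, not how any particular database implements its B-tree: a binary search stands in for one index descent, and the sizes m = 1,000,000 and n = 100 are arbitrary.

```python
def counted_bisect_left(keys, target, counter):
    """Binary search that records how many key comparisons it makes."""
    lo, hi = 0, len(keys)
    while lo < hi:
        mid = (lo + hi) // 2
        counter[0] += 1
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo


def between_lookup(keys, low, high):
    """One descent to the lower bound, then a sequential walk: about log m + n."""
    counter = [0]
    pos = counted_bisect_left(keys, low, counter)
    found = []
    while pos < len(keys):
        counter[0] += 1  # upper-bound check for each visited key
        if keys[pos] > high:
            break
        found.append(keys[pos])
        pos += 1
    return found, counter[0]


def in_lookup(keys, wanted):
    """One independent descent per listed value: about n * log m."""
    counter = [0]
    found = []
    for value in wanted:
        pos = counted_bisect_left(keys, value, counter)
        counter[0] += 1  # equality check at the landing position
        if pos < len(keys) and keys[pos] == value:
            found.append(keys[pos])
    return found, counter[0]


keys = list(range(1, 1_000_001))   # m = 1,000,000 ids, no holes
low, high = 500_000, 500_099       # n = 100 consecutive ids

between_rows, between_cost = between_lookup(keys, low, high)
in_rows, in_cost = in_lookup(keys, range(low, high + 1))

print("BETWEEN comparisons:", between_cost)
print("IN comparisons:     ", in_cost)
```

In this model the IN cost grows with n times the depth of the search, while the BETWEEN cost pays for the descent only once.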

Index cannot be used (n is a significant portion of m)

In any case, you'll get a full table scan and evaluate the predicate on each row:

  • BETWEEN needs to evaluate two predicates: One for the lower and one for the upper bound. The complexity is O(m)

  • IN needs to evaluate at most n predicates. The complexity is O(m * n), which is again worse, unless the database can optimise the IN list into a hashmap rather than a list of predicates, in which case it is O(m).
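The full-scan case can be modelled the same way. The sketch below walks every row and counts predicate evaluations for three variants: a range check, a naive IN list compared value by value, and an IN list turned into a hash set. The function names and the sizes m = 2,000 and n = 500 are hypothetical, and a real engine may short-circuit or reorder these checks differently.

```python
def scan_between(rows, low, high):
    """Full scan with a range predicate: at most two comparisons per row."""
    comparisons = 0
    matches = []
    for row_id in rows:
        comparisons += 1
        if row_id < low:
            continue
        comparisons += 1
        if row_id <= high:
            matches.append(row_id)
    return matches, comparisons


def scan_in_list(rows, values):
    """Full scan with a naive IN list: up to n comparisons per row."""
    comparisons = 0
    matches = []
    for row_id in rows:
        for value in values:
            comparisons += 1
            if row_id == value:
                matches.append(row_id)
                break
    return matches, comparisons


def scan_in_hashed(rows, values):
    """Full scan with the IN list as a hash set: one probe per row."""
    lookup = set(values)
    probes = 0
    matches = []
    for row_id in rows:
        probes += 1
        if row_id in lookup:
            matches.append(row_id)
    return matches, probes


m, n = 2_000, 500                      # n is a significant portion of m
rows = list(range(1, m + 1))
low, high = 1_001, 1_000 + n
values = list(range(low, high + 1))

between_matches, between_cost = scan_between(rows, low, high)
list_matches, list_cost = scan_in_list(rows, values)
hash_matches, hash_cost = scan_in_hashed(rows, values)

print("BETWEEN:", between_cost, "IN (list):", list_cost, "IN (hashed):", hash_cost)
```

The hashed variant brings IN back to one probe per row, which is the O(m) case mentioned above.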

Lukas Eder answered Jan 02 '23 22:01


a between b and c is a macro that expands to b <= a and a <= c.

a in (b,c,d) is a macro that expands to a=b or a=c or a=d.

Assuming your n and nk are integers, both should end up meaning the same. The between variant should be much faster because it's only two compares, versus up to nk - n + 1 compares for the in variant.
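The two expansions can be written out directly to see where they agree. This is a minimal sketch in Python rather than SQL, with hypothetical helper names between and in_list, and it ignores NULL handling. It also shows why the integer assumption matters: a non-integer value can fall inside the range without being equal to any listed value.

```python
def between(a, b, c):
    """a BETWEEN b AND c, expanded to two comparisons."""
    return b <= a and a <= c


def in_list(a, values):
    """a IN (v1, v2, ...), expanded to a chain of equality checks."""
    return any(a == v for v in values)


n, nk = 3, 7                        # hypothetical bounds
values = list(range(n, nk + 1))     # 3, 4, 5, 6, 7: nk - n + 1 values

# For integers, the two predicates agree on every candidate.
agree_on_integers = all(
    between(a, n, nk) == in_list(a, values) for a in range(0, 11)
)

# For a non-integer, the range matches but the list does not.
range_hit = between(4.5, n, nk)
list_hit = in_list(4.5, values)

print(agree_on_integers, range_hit, list_hit)
```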

Andomar answered Jan 03 '23 00:01