Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MYSQL OR vs IN performance

People also ask

What is MySQL performance?

Software MySQL Performance Tuning. SQL performance tuning is the process of maximizing query speeds on a relational database. The task usually involves multiple tools and techniques. These methods involve: Tweaking the MySQL configuration files.

Is MySQL the fastest?

For example, Windows Skills says that MySQL is faster, and Benchw says that PostgreSQL is faster. Ultimately, speed will depend on the way you're using the database. PostgreSQL is known to be faster while handling massive data sets, complicated queries, and read-write operations.

Why is MySQL running slow?

Queries can become slow for various reasons ranging from improper index usage to bugs in the storage engine itself. However, in most cases, queries become slow because developers or MySQL database administrators neglect to monitor them and keep an eye on their performance.

What is MySQL Optimization?

Optimization involves configuring, tuning, and measuring performance, at several levels. Depending on your job role (developer, DBA, or a combination of both), you might optimize at the level of individual SQL statements, entire applications, a single database server, or multiple networked database servers.


I needed to know this for sure, so I benchmarked both methods. I consistenly found IN to be much faster than using OR.

Do not believe people who give their "opinion", science is all about testing and evidence.

I ran a loop of 1000x the equivalent queries (for consistency, I used sql_no_cache):

IN: 2.34969592094s

OR: 5.83781504631s

Update:
(I don't have the source code for the original test, as it was 6 years ago, though it returns a result in the same range as this test)

In request for some sample code to test this, here is the simplest possible use case. Using Eloquent for syntax simplicity, raw SQL equivalent executes the same.

$t = microtime(true); 
for($i=0; $i<10000; $i++):
$q = DB::table('users')->where('id',1)
    ->orWhere('id',2)
    ->orWhere('id',3)
    ->orWhere('id',4)
    ->orWhere('id',5)
    ->orWhere('id',6)
    ->orWhere('id',7)
    ->orWhere('id',8)
    ->orWhere('id',9)
    ->orWhere('id',10)
    ->orWhere('id',11)
    ->orWhere('id',12)
    ->orWhere('id',13)
    ->orWhere('id',14)
    ->orWhere('id',15)
    ->orWhere('id',16)
    ->orWhere('id',17)
    ->orWhere('id',18)
    ->orWhere('id',19)
    ->orWhere('id',20)->get();
endfor;
$t2 = microtime(true); 
echo $t."\n".$t2."\n".($t2-$t)."\n";

1482080514.3635
1482080517.3713
3.0078368186951

$t = microtime(true); 
for($i=0; $i<10000; $i++): 
$q = DB::table('users')->whereIn('id',[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20])->get(); 
endfor; 
$t2 = microtime(true); 
echo $t."\n".$t2."\n".($t2-$t)."\n";

1482080534.0185
1482080536.178
2.1595389842987


I also did a test for future Googlers. Total count of returned results is 7264 out of 10000

SELECT * FROM item WHERE id = 1 OR id = 2 ... id = 10000

This query took 0.1239 seconds

SELECT * FROM item WHERE id IN (1,2,3,...10000)

This query took 0.0433 seconds

IN is 3 times faster than OR


The accepted answer doesn't explain the reason.

Below are quoted from High Performance MySQL, 3rd Edition.

In many database servers, IN() is just a synonym for multiple OR clauses, because the two are logically equivalent. Not so in MySQL, which sorts the values in the IN() list and uses a fast binary search to see whether a value is in the list. This is O(Log n) in the size of the list, whereas an equivalent series of OR clauses is O(n) in the size of the list (i.e., much slower for large lists)


I think the BETWEEN will be faster since it should be converted into:

Field >= 0 AND Field <= 5

It is my understanding that an IN will be converted to a bunch of OR statements anyway. The value of IN is the ease of use. (Saving on having to type each column name multiple times and also makes it easier to use with existing logic - you don't have to worry about AND/OR precedence because the IN is one statement. With a bunch of OR statements, you have to ensure you surround them with parentheses to make sure they are evaluated as one condition.)

The only real answer to your question is PROFILE YOUR QUERIES. Then you will know what works best in your particular situation.


It depends on what you are doing; how wide is the range, what is the data type (I know your example uses a numeric data type but your question can also apply to a lot of different data types).

This is an instance where you want to write the query both ways; get it working and then use EXPLAIN to figure out the execution differences.

I'm sure there is a concrete answer to this but this is how I would, practically speaking, figure out the answer for my given question.

This might be of some help: http://forge.mysql.com/wiki/Top10SQLPerformanceTips

Regards,
Frank


I think one explanation to sunseeker's observation is MySQL actually sort the values in the IN statement if they are all static values and using binary search, which is more efficient than the plain OR alternative. I can't remember where I've read that, but sunseeker's result seems to be a proof.