Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can the order of criteria in a WHERE clause affect performance in MySQL?

Given the following query:

select ....................
from ......................
where 
(
      lower(c01) like ? 
      or lower(c02) like ? 
      or lower(c5x3_.c01) like ? 
      or lower(5x3_.c02) like ?
 )    
 and(
(
      lower(c03) like ? 
      or lower(c04) like ? 
      or lower(c5x3_.c03) like ? 
      or lower(5x3_.c04) like ?
 )    
 /* ...Multiple further LIKEs... */
 and status=1; /* status is a non-nullable value whose values can only be 1 or 2 */

The first 10 or 12 restrictions are LIKEs on various patterns - I know this can be slow so always put the status criterion first, i.e:

status=1;
 and(other LIKEs)

My question: Could there be a performance gain in applying the simpler restrictions first (e.g. in this case, the int value being compared can't be null value and only has 2 possible values)? Or will performance be the same if the LIKEs are put first and the status check appears later?

Let's suppose there are no indexes on any column to simplify the question.

like image 463
chiperortiz Avatar asked Nov 08 '16 14:11

chiperortiz


3 Answers

(Part answer, part asking for more info...)

It makes no difference whether you put status=1 first or last. Phrased another way, the order of ANDed things in a WHERE is unimportant.

LOWER(x) LIKE '...' is much less efficient than setting the collation of x to be case-folding, such as utf8_general_ci (the ci means "case insensitive"). With a ci collation, you need do only x LIKE '...'.

But, that won't speed things up much. You have lots of ORs; these are inefficient because an index is usually useless, thereby necessitating a full table scan.

Often, OR can be turned into UNION to be able to use indexes. But probably not in your case.

Please provide SHOW CREATE TABLE. Hmmm... The pattern of the query makes me wonder if you have a lot of 'identical' tables. If so, perhaps we can jump on that as an inefficient way to design a schema. So, please tell us if c01 and c03 (etc) look the same.

What kinds of values will be put into the LIKEs? Words? Numbers? Arbitrary strings? Wildcards? Leading wildcards? Not leading? If "words", then FULLTEXT might be useful. But... We will need to look hard at the long(?) list of tables involved.

like image 137
Rick James Avatar answered Sep 30 '22 04:09

Rick James


The order in which conditions are placed can make a difference because of MySQL short-circuiting. Here is an attempt to prove this:

http://rextester.com/HJGN96158

This runs the following queries:

SELECT myint FROM mytable WHERE myint >= 3 AND myslowfunction('query #1', myint) = 1;
SELECT myint FROM mytable WHERE myslowfunction('query #2', myint) = 1 AND myint >= 3;

The only difference between these is the order of operands in the AND condition.

myslowfunction deliberately sleeps for a second and has the side effect of adding an entry to a log table each time it is run. Here are the results of what is logged when running the above two queries:

myslowfunction called for query #1 with value 3
myslowfunction called for query #1 with value 4
myslowfunction called for query #2 with value 1
myslowfunction called for query #2 with value 2
myslowfunction called for query #2 with value 3
myslowfunction called for query #2 with value 4

The above shows that a slow function is executed more times when it appears on the left side of an AND condition when the other operand isn't always false (due to short-circuiting).

From this, my advice would be to put the status = 1 check before the LIKE checks as (indexes aside) it is likely to be evaluated faster.

like image 24
Steve Chambers Avatar answered Sep 30 '22 03:09

Steve Chambers


is just the same thing put the ilikes first and the status criteria later?

You can find out the answer yourself by running EXPLAIN against the two versions of your query. E.g.

EXPLAIN select ... from ... where

Most probably you'll get two exactly same execution plans meaning the two statements are equivalent because the status part should be removed by the dead code elimination optimization.

like image 33
Ivan Avatar answered Sep 30 '22 02:09

Ivan