Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

inner join select (A,B) on A and B vs where (A,B) in select(A, B) in mysql

Tags:

sql

mysql

What is the best query between theses two ? They output the same result, expect one is doing the condition inside a where in and the other one inside an inner join.

select uv.* from version v inner join user_version uv ON v.id=uv.version_id
WHERE (v.number, v.master_id) IN (
select max(v.number) as number, v.master_id 
  from version v inner join user_version uv ON v.id=uv.version_id group by v.master_id);

and

 select * from user_version uv 
   inner join version v on v.id=uv.version_id and v.number
   inner join (
      select uv2.user_id, max(v2.number) maxNumber, v2.master_id master_id, v2.id version_id from version v2 
         inner join user_version uv2 on v2.id=uv2.version_id group by v2.master_id ) test
   on test.master_id=v.master_id and test.maxNumber=v.number ;

I've created a sqlfiddle with an example : http://sqlfiddle.com/#!2/76001/62 (The idea is to get the biggest version of a "master" entity linked to a given user)

If you have other ideas (I'm using mysql, so I can't use windows function)

Thanks

like image 524
Quentin Avatar asked Oct 21 '22 21:10

Quentin


1 Answers

This is not too easy to answer this question. You should know one important thing: MySQL treats IN (<static values list>) and IN (<subquery>) as different queries. First one is equal to range comparison (like .. OR = .. OR =) while second is equal to = ANY () - and it's not the same. So, to say short: using IN with subquery will cause query with ANY() and MySQL will not use index for that even if subquery is independent and returns static list of values. Sad, but true. MySQL can't predict that and so index will not be used even if it's obvious. If you'll use JOIN (i.e. rewrite your IN (<subquery>)) - then MySQL will use index for JOIN condition, if it's possible.

Now, the second case may be about JOIN and IN when using partitions. If you'll use JOIN - then, sadly - but MySQL also is not able to predict partitions for JOIN in common case - and it will use entire set of partitions for it. Replacing JOIN to IN (<static list>) will change EXPLAIN PARTITION picture: MySQL will use only those partitions, which are needed for selecting values from range, specified within IN clause. But, again, this will not work with IN (<subquery>).

As a conclusion - it's sad, when we're saying about how MySQL is handling IN subqueries - and in common case it can't be replaced with JOIN safely (that's about partitioning case). So, common solution will be: separate subquery from main query on application level. If we're saying about independent subquery, returning static values list, that's the best suggestion - then you can substitute that values list as IN(<static list>) and gain benefits: MySQL will use index for it, and, if we're saying about partitions, only actually needed from them will be used.

like image 79
Alma Do Avatar answered Nov 01 '22 17:11

Alma Do