Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Join within an "exists" subquery

I am wondering why when you join two tables on a key within an exists subquery the join has to happen within the WHERE clause instead of the FROM clause.

This is my example:

Join within FROM clause:

SELECT payer_id
  FROM Population1
 WHERE NOT EXISTS
       (Select *
          From Population2 join Population1 
            On Population2.payer_id = Population1.payer_id)

Join within WHERE clause:

SELECT payer_id
  FROM Population1
  WHERE NOT EXISTS
        (Select *
           From Population2
          WHERE Population2.payer_id = Population1.payer_id)

The first query gives me 0 results, which I know is incorrect, while the second query gives the the thousands of results I am expecting to see.

Could someone just explain to me why where the join happens in an EXISTS subquery matters? If you take the subqueries without the parent query and run them they literally give you the same result.

It would help me a lot to remember to not continue to make this mistake when using exists.

Thanks in advance.

like image 290
Tyler Avatar asked Sep 11 '25 01:09

Tyler


1 Answers

You need to understand the distinction between a regular subquery and a correlated subquery.

Using your examples, this should be easy. The first where clause is:

where not exists (Select 1
                  from Population2 join
                       Population1
                       on Population2.payer_id = Population1.payer_id
                 )

This condition does exactly what it says it is doing. The subquery has no connection to the outer query. So, the not exists will either filter out all rows or keep all rows.

In this case, the engine runs the subquery and determines that at least one row is returned. Hence, not exists returns false in all cases, and the nothing is returned.

In the second case, the subquery is a correlated subquery. So, for each row in population1 the subquery is run using the value of Population1.payer_id. In some cases, matching rows exist in Population2; these are filtered out. In other cases, matching rows do not exist; these are in the result set.

like image 122
Gordon Linoff Avatar answered Sep 13 '25 16:09

Gordon Linoff