Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Converting NOT IN to NOT EXISTS

Tags:

sql

oracle

Having a nightmare of a time understanding the usage of NOT EXISTS, primarily how to convert my NOT IN solution below so that I can actually understand how I achieved the results. Have several articles on askTom, the oracle forums and stackoverflow, but can't find anything that clearly helps understand this problem. My apologies if I have missed it through my noobish searching.

SELECT s.S_Fname, s.S_Lname
FROM STUDENT s
WHERE s.S_Sex = 'F'
AND S.S_Id NOT IN(SELECT e.S_Id
        FROM ENROLLMENT e
        WHERE e.Mark < 70);

Bit of assistance with the content, trying to find female students who have never received a mark below 70 in any class they have been enrolled in.

like image 742
TheRussian Avatar asked May 23 '13 06:05

TheRussian


1 Answers

It's rather simple, when you get the hang of it:

SELECT s.S_Fname, s.S_Lname
FROM STUDENT s
WHERE s.S_Sex = 'F'
AND S.S_Id NOT IN(SELECT e.S_Id           -- take this line
        FROM ENROLLMENT e
        WHERE e.Mark < 70);

That line basically compares S.S_Id with all the e.S_Id values that come from the subquery.

Now change that to NOT EXISTS and put an equality check S.S_Id = e.S_Id, inside the subquery:

SELECT s.S_Fname, s.S_Lname
FROM STUDENT s
WHERE s.S_Sex = 'F'
AND NOT EXISTS (SELECT e.S_Id          
        FROM ENROLLMENT e
        WHERE (e.Mark < 70)       -- if this is complex, you'll need parentheses
        AND S.S_Id = e.S_Id);

Minor possible change is to realize that (SELECT e.S_Id ... does not really need the e.S_Id. Subqueries with EXISTS and NOT EXISTS just check if there are rows returned or not and the column values do not matter. You can put SELECT * or a constant there (SELECT 1 is common) or SELECT NULL or even SELECT 1/0 (Yes, that will work!):

SELECT s.S_Fname, s.S_Lname
FROM STUDENT s
WHERE s.S_Sex = 'F'
AND NOT EXISTS (SELECT 1
        FROM ENROLLMENT e
        WHERE e.Mark < 70  
        AND S.S_Id = e.S_Id);

Another major consideration is that when you do the conversion this way, the (seemingly equivalent) NOT EXISTS and NOT IN writings of a query are really equivalent only if both S_Id columns are not nullable. If the e.S_Id column is nullable, the NOT IN may result in the whole query to return no rows at all (because x NOT IN (a, b, c, ...) is equivalent to x<>a AND x<>b AND ... and that condition cannot be true when one of the a,b,c... is NULL.)

For similar reasons, you will have different results if the s.S_Id is nullable (that's not very likely in this case as it's probably the primary key but in other cases it matters.)

So it's almost always better to use NOT EXISTS, as it behaves differently even if either column is nullable (the S.S_Id = e.S_Id check will discard rows with null earlier) and usually this behaviour is the wanted one. There are many details in the question: NOT IN vs NOT EXISTS, in the answer by @Martin Smith. You will also find there ways to convert the NOT IN to NOT EXISTS and keep the null related (unpleasant) behaviour.

like image 118
ypercubeᵀᴹ Avatar answered Nov 06 '22 04:11

ypercubeᵀᴹ