Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

At least one X but no Ys Query

I come across this pattern occasionally and I haven't found a terribly satisfactory way to solve it.

Say I have a employee table and an review table. Each employee can have more than one review. I want to find all the employees who have at least one "good" review but no "bad" reviews.

I haven't figured out how to make subselects work without knowing the employee ID before hand and I haven't figured out the right combination of joins to make this happen.

Is there a way to do this WITHOUT stored procedures, functions or bringing the data server side? I've gotten it to work with those but I'm sure there's another way.

like image 732
Drew Avatar asked Dec 10 '22 02:12

Drew


2 Answers

Since you haven't posted your DB Structure, I made some assumptions and simplifications (regarding the rating column, which probably is number and not a character field). Adjust accordingly.

Solution 1: Using Joins

select distinct e.EmployeeId, e.Name
from employee e
left join reviews r1 on e.EmployeeId = r1.EmployeeId and r1.rating = 'good'
left join reviews r2 on e.EmployeeId = r2.EmployeeId and r1.rating = 'bad'
where r1.ReviewId is not null --meaning there's at least one
and r2.ReviewId is null --meaning there's no bad review

Solution 2: Grouping By and Filtering with Conditional Count

select e.EmployeeId, max(e.Name) Name
from employee e
left join reviews r on e.EmployeeId = r.EmployeeId
group by e.EmployeeId
having count(case r.rating when 'good' then 1 else null end) > 0
and  count(case r.rating when 'bad' then 1 else null end) = 0

Both solutions are SQL ANSI compatible, which means both work with any RDBMS flavor that fully support SQL ANSI standards (which is true for most RDBMS).

As pointed out by @onedaywhen, the code will not work in MS Access (have not tested, I'm trusting in his expertise on the subject).

But I have one saying on this (which might make some people upset): I hardly consider MS Access a RDBMS. I have worked with it in the past. Once you move on (Oracle, SQL Server, Firebird, PostGreSQL, MySQL, you name it), you do not ever want to come back. Seriously.

like image 151
Adriano Carneiro Avatar answered Jan 12 '23 11:01

Adriano Carneiro


The question -- return rows on side A based on nonexistence of a match in B -- (employees with No "Bad" reviews) describes an "anti-semi join". There are numerous ways to accomplish this kind of query, at least 5 I've discovered in MS Sql 2005 and above.

I know this solution works in MSSQL 2000 and above, and is the most efficient out of the 5 ways I've tried in MS Sql 2005 and 2008. I am not sure if it will work in MySQL, but it should, as it reflects a rather common set operation.

Note, the IN clause gives the subquery access to the employee table in the outer scope.

SELECT EE.*
FROM   employee EE
WHERE
    EE.EmpKey IN (
      SELECT RR.EmpKey
      FROM   review RR
      WHERE  RR.EmpKey = EE.EmpKey
        AND  RR.ScoreCategory = 'good'
    ) 
  AND
    EE.EmpKey NOT IN (
      SELECT  RR.EmpKey 
      FROM    review RR
      WHERE   RR.EmpKey = EE.EmpKey
        AND   RR.ScoreCategory = 'bad'
    )
like image 39
Griffin Avatar answered Jan 12 '23 09:01

Griffin