First - apologies for the fuzzy title, I could not find a better one.
I have table with the following structure (simplification):
EmpID DeptID
1 1
1 2
2 1
3 2
4 5
5 2
This table represents a many-to-many relationship.
I'm interested in finding all the EmpIDs that are related to a specific group of DeptIDs, for example I want all the EmpIDs that are related to DeptIDs 1, 2 and 3. Please note it's an AND relationship and not an OR relationship. For my case, the EmpID may be related to additional DeptIDs besides 1, 2 and 3 for it to be a valid answer.
The number of DeptIDs I'm interested in changes (i.e. I may want EmpIDs who're related to both DeptID 3 and 5, or I may want EmpIDs related to DepIDs 2, 3, 4, 5, 6, 7).
When I try to approach this problem I find myself either creating a JOIN per DepID, or a subquery per DeptID. This would mean I have to generate a new query per the number of DeptIDs I'm testing against. I would obviously prefer having a static query with a parameter or set of parameters.
I'm working over both SQL Server and MySQL (developing in parallel two versions of my code).
Any ideas?
When you need to establish a many-to-many relationship between two or more tables, the simplest way is to use a Junction Table. A Junction table in a database, also referred to as a Bridge table or Associative Table, bridges the tables together by referencing the primary keys of each data table.
In this case the two tables are joined using the relationship table1.id = table2.id . It is possible to use multiple join statements together to join more than one table at the same time. To do that you add a second INNER JOIN statement and a second ON statement to indicate the third table and the second relationship.
Relational databases don't support direct many-to-many relationships between two tables. Then, how to implement many-to-many relationships in SQL? To create a many-to-many relationship in a database, you'll need to create a third table to connect the other two.
I'm assuming you want to find employees that are in ALL of the specified departments and not just the employees that are in ANY of the departments, which is a far easier query.
SELECT EmpID
FROM mytable t1
JOIN mytable t2 ON t1.EmpID = t2.EmpID AND t2.DeptID = 2
JOIN mytable t3 ON t2.EmpID = t3.EmpID AND t3.DeptID = 3
WHERE DeptID = 1
I'm going to preempt the inevitable suggestion that'll come to use aggregation:
SELECT EmpID
FROM mytable
WHERE DeptID IN (1,2,3)
GROUP BY EmpID
HAVING COUNT(1) = 3
Resist that temptation. It's significantly slower. A similar scenario to this came up in SQL Statement - “Join” Vs “Group By and Having” and the second version was, in that second, about twenty times slower.
I'd also suggest you look at Database Development Mistakes Made by AppDevelopers.
I'd start from something like:
SELECT EmpID, COUNT(*) AS NumDepts
FROM thetable
WHERE DeptID IN (1, 2, 3)
GROUP BY EmpId
HAVING COUNT(*) == 3
of course, that 3 in the last line would always be the length of the sequence of department ids you're checking (so for (2,3,4,5,6,7)
it would be 6). This is one natural way to express "employees connected to all of these departments".
Edit: I see a note in another answer about performance issues -- I've tried this approach in SQLite and PostgreSQL, with appropriate indices, and there it looks like it's performing well and with appropriate use of all said indices; and in MySQL 5.0, where I have to admit performance was nowhere as good.
I suspect (without an opportunity to benchmark this on a zillion more engines;-) that other really good SQL engines (such as SQL Server 2008, Oracle, IBM DB2, the new open-source Ingres...) will also optimize this query well, while other mediocre ones (can't think of any with a popularity anywhere close to MySQL's) won't.
So, no doubt your favorite answer will depend on what engines you really care about (this takes me back to the time, over a decade ago, when my responsibilities included managing the team which maintained a component that was supposed to provide well-performing queries over more than half a dozen disparate engines -- talk about nightmare jobs...!-).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With