Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL: Many-To-Many table AND query

First - apologies for the fuzzy title, I could not find a better one.

I have table with the following structure (simplification):

EmpID DeptID

1     1
1     2
2     1
3     2
4     5
5     2

This table represents a many-to-many relationship.

I'm interested in finding all the EmpIDs that are related to a specific group of DeptIDs, for example I want all the EmpIDs that are related to DeptIDs 1, 2 and 3. Please note it's an AND relationship and not an OR relationship. For my case, the EmpID may be related to additional DeptIDs besides 1, 2 and 3 for it to be a valid answer.

The number of DeptIDs I'm interested in changes (i.e. I may want EmpIDs who're related to both DeptID 3 and 5, or I may want EmpIDs related to DepIDs 2, 3, 4, 5, 6, 7).

When I try to approach this problem I find myself either creating a JOIN per DepID, or a subquery per DeptID. This would mean I have to generate a new query per the number of DeptIDs I'm testing against. I would obviously prefer having a static query with a parameter or set of parameters.

I'm working over both SQL Server and MySQL (developing in parallel two versions of my code).

Any ideas?

like image 318
Roee Adler Avatar asked Jun 28 '09 05:06

Roee Adler


People also ask

How do you write a SQL query for many-to-many relationships?

When you need to establish a many-to-many relationship between two or more tables, the simplest way is to use a Junction Table. A Junction table in a database, also referred to as a Bridge table or Associative Table, bridges the tables together by referencing the primary keys of each data table.

How do I join many-to-many tables in SQL?

In this case the two tables are joined using the relationship table1.id = table2.id . It is possible to use multiple join statements together to join more than one table at the same time. To do that you add a second INNER JOIN statement and a second ON statement to indicate the third table and the second relationship.

Can you have a many-to-many relationship in SQL?

Relational databases don't support direct many-to-many relationships between two tables. Then, how to implement many-to-many relationships in SQL? To create a many-to-many relationship in a database, you'll need to create a third table to connect the other two.


2 Answers

I'm assuming you want to find employees that are in ALL of the specified departments and not just the employees that are in ANY of the departments, which is a far easier query.

SELECT EmpID
FROM mytable t1
JOIN mytable t2 ON t1.EmpID = t2.EmpID AND t2.DeptID = 2
JOIN mytable t3 ON t2.EmpID = t3.EmpID AND t3.DeptID = 3
WHERE DeptID = 1

I'm going to preempt the inevitable suggestion that'll come to use aggregation:

SELECT EmpID
FROM mytable
WHERE DeptID IN (1,2,3)
GROUP BY EmpID
HAVING COUNT(1) = 3

Resist that temptation. It's significantly slower. A similar scenario to this came up in SQL Statement - “Join” Vs “Group By and Having” and the second version was, in that second, about twenty times slower.

I'd also suggest you look at Database Development Mistakes Made by AppDevelopers.

like image 50
cletus Avatar answered Oct 15 '22 00:10

cletus


I'd start from something like:

SELECT EmpID, COUNT(*) AS NumDepts
FROM thetable
WHERE DeptID IN (1, 2, 3)
GROUP BY EmpId
HAVING COUNT(*) == 3

of course, that 3 in the last line would always be the length of the sequence of department ids you're checking (so for (2,3,4,5,6,7) it would be 6). This is one natural way to express "employees connected to all of these departments".

Edit: I see a note in another answer about performance issues -- I've tried this approach in SQLite and PostgreSQL, with appropriate indices, and there it looks like it's performing well and with appropriate use of all said indices; and in MySQL 5.0, where I have to admit performance was nowhere as good.

I suspect (without an opportunity to benchmark this on a zillion more engines;-) that other really good SQL engines (such as SQL Server 2008, Oracle, IBM DB2, the new open-source Ingres...) will also optimize this query well, while other mediocre ones (can't think of any with a popularity anywhere close to MySQL's) won't.

So, no doubt your favorite answer will depend on what engines you really care about (this takes me back to the time, over a decade ago, when my responsibilities included managing the team which maintained a component that was supposed to provide well-performing queries over more than half a dozen disparate engines -- talk about nightmare jobs...!-).

like image 38
Alex Martelli Avatar answered Oct 15 '22 00:10

Alex Martelli