I have a table that contains a list of objects and which requirements they fulfill. Then I have a table that contains a list of tasks and which requirements an object must fulfill to be able to perform the task. I would like to query: given a task, show me all objects that can perform that task, and given an object, show me all tasks that object can perform:
Example:
task_req table
tasks | reqs
-----------------
taskA | req1
taskA | req2
taskA | req3
taskB | req4
taskB | req5
taskB | req6
So this table says that to perform taskA, requirements req1, req2, and req3 are needed.
obj_reqs table
object | reqs
----------------
obj1 | req3
obj1 | req4
obj2 | req1
obj2 | req2
obj2 | req3
obj2 | req4
So I could ask the question: which objects can perform taskA? The answer should be just one row:
tasks | objects
-------------------
taskA | object2
because obj2 is the only one that has fulfills requirements req1,req2,req3. Different question: which objects can perform taskB? The answer is none, because there's no object with requirements req4,req5,req6. The query should be handle logic where one task can be performed by multiple objects by returning multiple rows.
The question is: what query does this?
My problem is that I have managed to find such a query, but it seems to me too complicated. The query basically does: A) inner join task_reqs table with obj_reqs table, group by tasks and objs and count distinct requirements, B) select tasks,count(distinct(reqs)) from task_reqs group by tasks, C) inner join A and B on both task and count(distinct(reqs)).
Surely there's an easier way to do this query, right?
I'm pasting below the SQL code to generate the tables and my query.
create table task_reqs (task varchar, req varchar);
create table obj_reqs (object varchar, req varchar);
insert into task_reqs values ('taskA', 'req1');
insert into task_reqs values ('taskA', 'req2');
insert into task_reqs values ('taskA', 'req3');
insert into task_reqs values ('taskB', 'req4');
insert into task_reqs values ('taskB', 'req5');
insert into task_reqs values ('taskB', 'req6');
insert into obj_reqs values ('obj1','req1');
insert into obj_reqs values ('obj1','req3');
insert into obj_reqs values ('obj2','req1');
insert into obj_reqs values ('obj2','req2');
insert into obj_reqs values ('obj2','req3');
insert into obj_reqs values ('obj2','req4');
and my query:
select t.task,t.object,n.n_reqs
from (
select task,object,count(distinct(obj_reqs.req)) as n_reqs
from task_reqs
inner join obj_reqs on task_reqs.req=obj_reqs.req
group by task,object
) t
inner join (
select task,count(distinct(req)) as n_reqs
from task_reqs
group by task
) n
on n.n_reqs=t.n_reqs and n.task=t.task;
which returns:
task | object | n_reqs
-------+--------+--------
taskA | obj2 | 3
Surely there's a simpler way.
The SQL SELECT statement is used to retrieve records from one or more tables in your SQL database.
To use the WHERE clause to perform the same join as you perform using the INNER JOIN syntax, enter both the join condition and the additional selection condition in the WHERE clause. The tables to be joined are listed in the FROM clause, separated by commas. This query returns the same output as the previous example.
An SQL query can JOIN three tables (or more). Simply add an extra JOIN condition for the third table. 3-Table JOINs work with SELECT, UPDATE, and DELETE queries.
The INNER JOIN keyword selects all rows from both the tables as long as the condition is satisfied. This keyword will create the result-set by combining all rows from both the tables where the condition satisfies i.e value of the common field will be the same.
You can do it with a cross join of the tables:
select t.task, o.object, count(distinct t.req) n_reqs
from task_reqs t cross join obj_reqs o
where t.task = 'taskA'
group by t.task, o.object
having count(distinct t.req) = count(case when t.req = o.req then 1 end)
See the demo.
Results:
| task | object | n_reqs |
| ----- | ------ | ------ |
| taskA | obj2 | 3 |
Here is a possible simpler way:
select t.task, o.object, count(t.req) n_reqs
from task_reqs t left join obj_reqs o on t.req = o.req
group by t.task, o.object
having o.object is not null and count(t.req) = (select count(req) from task_reqs where
task = t.task)
Demo
Your query seems fine. I believe this is going to complicated no matter how you go after it since the join criteria and/or where predicates are going to be dependent on both req
and the count of req
matching.
Window functions may cut down on processing time here since you can eliminate a table scan from you original query.
SELECT DISTINCT task, object
FROM
(
SELECT task,
object,
COUNT(*) OVER (PARTITION BY task, object) matchCount,
trqs.reqCount
FROM (SELECT task, req, count(*) OVER (PARTITION BY task) as reqcount FROM task_reqs) trqs
INNER JOIN obj_reqs orqs
ON trqs.req = orqs.req
) taskreqcounter
WHERE matchCount = reqCount
If you have an index on obj_reqs.req
I think you would find this query to be pretty quick too. If you are interested in only a particular task then you can add that to the WHERE
clause in the inner most subquery (trqs
).
SQLFiddle here
Reversing this logic works for question 2
SELECT DISTINCT task, object
FROM
(
SELECT task,
object,
COUNT(*) OVER (PARTITION BY task, object) matchCount,
orqs.reqCount
FROM (SELECT object, req, count(*) OVER (PARTITION BY object) as reqcount FROM obj_reqs) orqs
INNER JOIN task_reqs trqs
ON orqs.req = trqs.req
) taskreqcounter
WHERE matchCount = reqCount
SQLFiddle here
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With