SQL: join: find all objects that satisfy some requirements

Tags:

sql

I have a table that contains a list of objects and which requirements they fulfill. Then I have a table that contains a list of tasks and which requirements an object must fulfill to be able to perform the task. I would like to query: given a task, show me all objects that can perform that task, and given an object, show me all tasks that object can perform:

Example:

task_req table

tasks   |    reqs
-----------------
taskA   |    req1
taskA   |    req2
taskA   |    req3
taskB   |    req4
taskB   |    req5
taskB   |    req6

So this table says that to perform taskA, requirements req1, req2, and req3 are needed.

obj_reqs table

object  |   reqs
----------------
obj1    |   req3
obj1    |   req4
obj2    |   req1
obj2    |   req2
obj2    |   req3
obj2    |   req4

So I could ask the question: which objects can perform taskA? The answer should be just one row:

tasks   |   objects
-------------------
taskA   |   object2

because obj2 is the only one that has fulfills requirements req1,req2,req3. Different question: which objects can perform taskB? The answer is none, because there's no object with requirements req4,req5,req6. The query should be handle logic where one task can be performed by multiple objects by returning multiple rows.

The question is: what query does this?

My problem is that I have managed to find such a query, but it seems to me too complicated. The query basically does: A) inner join task_reqs table with obj_reqs table, group by tasks and objs and count distinct requirements, B) select tasks,count(distinct(reqs)) from task_reqs group by tasks, C) inner join A and B on both task and count(distinct(reqs)).

Surely there's an easier way to do this query, right?

I'm pasting below the SQL code to generate the tables and my query.

create table task_reqs (task varchar, req varchar);
create table obj_reqs (object varchar, req varchar);
insert into task_reqs values ('taskA', 'req1');
insert into task_reqs values ('taskA', 'req2');
insert into task_reqs values ('taskA', 'req3');
insert into task_reqs values ('taskB', 'req4');
insert into task_reqs values ('taskB', 'req5');
insert into task_reqs values ('taskB', 'req6');
insert into obj_reqs values ('obj1','req1');
insert into obj_reqs values ('obj1','req3');
insert into obj_reqs values ('obj2','req1');
insert into obj_reqs values ('obj2','req2');
insert into obj_reqs values ('obj2','req3');
insert into obj_reqs values ('obj2','req4');

and my query:

select t.task,t.object,n.n_reqs
from (
    select task,object,count(distinct(obj_reqs.req)) as n_reqs
    from task_reqs
    inner join obj_reqs on task_reqs.req=obj_reqs.req
    group by task,object
) t
inner join (
    select task,count(distinct(req)) as n_reqs
    from task_reqs
    group by task
) n
on n.n_reqs=t.n_reqs and n.task=t.task;

which returns:

 task  | object | n_reqs 
-------+--------+--------
 taskA | obj2   |      3

Surely there's a simpler way.

981

asked Dec 17 '19 18:12

cheif

3 Answers

You can do it with a cross join of the tables:

select t.task, o.object, count(distinct t.req) n_reqs 
from task_reqs t cross join obj_reqs o
where t.task = 'taskA'
group by t.task, o.object
having count(distinct t.req) = count(case when t.req = o.req then 1 end)

See the demo.
Results:

| task  | object | n_reqs |
| ----- | ------ | ------ |
| taskA | obj2   | 3      |

answered Sep 21 '22 23:09

forpas

Here is a possible simpler way:

select t.task, o.object, count(t.req) n_reqs
  from task_reqs t left join obj_reqs o on t.req = o.req
  group by t.task, o.object
  having o.object is not null and count(t.req) = (select count(req) from task_reqs where 
  task = t.task)

Demo

answered Sep 18 '22 23:09

Amir Molaei

Your query seems fine. I believe this is going to complicated no matter how you go after it since the join criteria and/or where predicates are going to be dependent on both req and the count of req matching.

Window functions may cut down on processing time here since you can eliminate a table scan from you original query.

SELECT DISTINCT task, object
FROM
  (
    SELECT task, 
      object, 
      COUNT(*) OVER (PARTITION BY task, object) matchCount,
      trqs.reqCount
    FROM (SELECT task, req, count(*) OVER (PARTITION BY task) as reqcount FROM task_reqs) trqs
      INNER JOIN obj_reqs orqs
        ON trqs.req = orqs.req
   ) taskreqcounter
WHERE matchCount = reqCount

If you have an index on obj_reqs.req I think you would find this query to be pretty quick too. If you are interested in only a particular task then you can add that to the WHERE clause in the inner most subquery (trqs).

SQLFiddle here

Reversing this logic works for question 2

SELECT DISTINCT task, object
FROM
  (
    SELECT task, 
      object, 
      COUNT(*) OVER (PARTITION BY task, object) matchCount,
      orqs.reqCount
    FROM (SELECT object, req, count(*) OVER (PARTITION BY object) as reqcount FROM obj_reqs) orqs
      INNER JOIN task_reqs trqs
        ON orqs.req = trqs.req
   ) taskreqcounter
WHERE matchCount = reqCount

SQLFiddle here

answered Sep 18 '22 23:09

JNevill

Related questions
                            
                                InnerException : Table 'xxx.aspnetusers' doesn't exist
                            
                                Detecting patterns in table
                            
                                ORDER BY depending on parameter results in error
                            
                                Nested TransactionScope for NUnit TestFixure and SetUp
                            
                                How to specify field data type when reading from a SQL database
                            
                                How do I increase the speed of my Postgres select statement?
                            
                                ALTER TABLE lhm migration deletes existing values
                            
                                Union over fields having different names using peewee
                            
                                SQL Server : compare dates between rows
                            
                                Possible Oracle-Bug with SubQueries and Group Functions
                            
                                How to make a tuple be unique in SQL table?
                            
                                Android app with realTime database without using FIREBASE maybe Something like hooking on server [closed]
                            
                                Deleting LOBs does not decrease Hibernate H2 DB size
                            
                                EF: Incorrect usage of spatial/fulltext/hash index and explicit index order
                            
                                How to Summarize Consecutive Rows in a MySQL Table
                            
                                Implementing MERGE INTO sql in pyspark
                            
                                SQL query with join to get nested array of objects
                            
                                How can I improve query performance for 200+ million records
                            
                                How to get unique / distinct elements inside JSON array in MySQL 5.7
                            
                                How to unload data from Redshift to S3?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With