Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find the one matching set in a number of sets

I have a table (@t1) that contains a number of sets. I want to find a perfect match for @t2 in @t1.

In this example, the desired result is 1.

(Set 1 matches perfectly, set 2 contains three elements while @t2 only contains two elements, set 3 contains less elements than @t2, set 4 contains NULL elements which are not allowed in @t2 and set 5 contains the right number of elements but one of the elements is not equal.)

DECLARE @t1 TABLE (id INT, data INT);
DECLARE @t2 TABLE (data INT PRIMARY KEY);

INSERT INTO @t1 (id, data)
VALUES
(1, 1),
(1, 2),
(2, 1),
(2, 2),
(2, 3),
(3, 1),
(4, NULL),
(4, NULL),
(5, 1),
(5, 3);

INSERT @t2 (data)
VALUES
(1),
(2);

I have a query that may be getting the job done, but it looks somehow wretched too me.

WITH t1 AS
(
    SELECT id, data
    FROM @t1
    WHERE data IS NOT NULL
),
t1_count AS
(
    SELECT id, RCount = COUNT(*)
    FROM @t1
    WHERE data IS NOT NULL
    GROUP BY id
)
SELECT t1.id
FROM t1
JOIN t1_count ON t1.id = t1_count.id
FULL JOIN @t2 t2 ON t1.data = t2.data
WHERE t1_count.RCount = (SELECT RCount = COUNT(*) FROM @t2)
GROUP BY t1.id
HAVING COUNT(t1.data) = COUNT(t2.data);

EDIT (GarethD's comment):

WITH t1 AS
(
    SELECT
        id,
        data,
        RCount = COUNT(*) OVER(PARTITION BY id)
    FROM @t1
    WHERE data IS NOT NULL
)
SELECT t1.id
FROM t1
FULL JOIN @t2 t2 ON t1.data = t2.data
WHERE t1.RCount = (SELECT RCount = COUNT(*) FROM @t2)
GROUP BY t1.id
HAVING COUNT(t1.data) = COUNT(t2.data);
like image 959
TvdH Avatar asked Jun 12 '15 12:06

TvdH


1 Answers

What you want is called Exact Relational Division. Unfortunately, SQL Server has no native operator for this, but it is a well-documented problem. One possible solution (idea taken from an article by Joe Celko) is to compare counts, similar to what you are already doing:

SELECT t1.id
  FROM @t1 AS t1 LEFT JOIN @t2 AS t2 ON t1.data = t2.data
 GROUP BY t1.id
HAVING COUNT(t1.data) = (SELECT COUNT(data) FROM @t2)
   AND COUNT(t2.data) = (SELECT COUNT(data) FROM @t2);

Note that both HAVING comparisons are necessary:

  • The first one ensures that t1 has exactly the required number of rows and
  • the second one ensures that these rows only contain values from t2 (otherwise, t2.data would be NULL via the LEFT JOIN. Recall that COUNT(x) only counts non-null values of x).
like image 162
Heinzi Avatar answered Sep 29 '22 13:09

Heinzi