I am trying to select only the groups that are not contained in other groups.
In this example group number 2 is contained in group number 1 because group number 1 has all values of group number 2.
Group number 3 is not contained in group number 1 because it has the value 50 that group number 1 does not contain.
The result should be groups number 1 and 3.
(Or the opposite-to get only group number 2 that is contained in other group)
looking for a way without looping because I have over 2 million values.
My table looks like this:
group_number id
-------------------
1 10
1 20
1 30
1 40
2 10
2 40
3 10
3 30
3 50
I have tested with the set of data that you have provided and it works. First one, groups not contained in another group:
SELECT DISTINCT Group_Number FROM #T
WHERE NOT EXISTS (SELECT Group_Number G2
FROM #T AS T2
WHERE T2.Group_Number <> #t.Group_Number
AND T2.ID = #T.ID)
And... the other way is very easy having this one:
SELECT DISTINCT Group_Number FROM #T WHERE NOT Group_Number IN (
SELECT DISTINCT Group_Number FROM #T
WHERE NOT EXISTS (SELECT Group_Number G2
FROM #T AS T2
WHERE T2.Group_Number <> #t.Group_Number
AND T2.ID = #T.ID)
)
Just asking myself I realised that my response is not fully accurate. First, I realised that adding:
INSERT INTO #t
VALUES (6, 50),
(7, 60),
(8, 50),
(8, 60)
The group 8 did not appear as one item is present in group 6 and the other is in group 7. So, I did a lot of checks and concluded that the following code is the one that guarantees the results and also gives traceability to verify if response is correct or not:
SELECT DISTINCT Group_Number FROM
(
SELECT T1.Group_Number, T1.Rows, T2.Group_Number as Comparing_With_Other_Group, Count(DISTINCT T2.ID) AS Rows_On_Other_Group
FROM (
SELECT Group_Number, Count(DISTINCT ID) AS Rows
FROM #T
GROUP BY Group_Number
) T1
INNER JOIN #T AS T2
ON T1.Group_Number <> T2.Group_Number
AND EXISTS (SELECT 1 FROM #T WHERE #T.Group_Number = T1.Group_Number and #T.ID = T2.ID)
GROUP BY T1.Group_Number, T2.Group_Number, T1.Rows
) SubQry
WHERE Rows = Rows_On_Other_Group
If you run the SubQry only you will see the traceability while the full query will show you the Groups where the system can find another group that filtering the ID's for the ones in the Group that I am searching, finds the same number of ID's.
You can try the following query.
SELECT DISTINCT Group_Number FROM #temp
WHERE NOT EXISTS (SELECT DISTINCT Group_Number G2
FROM #temp AS T2
WHERE T2.Group_Number <> #temp.Group_Number
AND T2.ID = #temp.ID)
The output is-
Group_Number
1
3
Demo
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With