Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Server find contained groups

I am trying to select only the groups that are not contained in other groups.

In this example group number 2 is contained in group number 1 because group number 1 has all values of group number 2.

Group number 3 is not contained in group number 1 because it has the value 50 that group number 1 does not contain.

The result should be groups number 1 and 3.

(Or the opposite-to get only group number 2 that is contained in other group)

looking for a way without looping because I have over 2 million values.

My table looks like this:

group_number    id
-------------------    
1               10
1               20
1               30
1               40
2               10
2               40
3               10
3               30
3               50
like image 928
Dvey Avatar asked Dec 23 '22 02:12

Dvey


2 Answers

I have tested with the set of data that you have provided and it works. First one, groups not contained in another group:

SELECT DISTINCT Group_Number FROM #T
    WHERE NOT EXISTS (SELECT Group_Number G2 
                    FROM #T AS T2 
                    WHERE T2.Group_Number <> #t.Group_Number
                      AND T2.ID = #T.ID)

And... the other way is very easy having this one:

SELECT DISTINCT Group_Number FROM #T WHERE NOT Group_Number IN (
  SELECT DISTINCT Group_Number FROM #T
    WHERE NOT EXISTS (SELECT Group_Number G2 
                    FROM #T AS T2 
                    WHERE T2.Group_Number <> #t.Group_Number
                      AND T2.ID = #T.ID)
  )

Just asking myself I realised that my response is not fully accurate. First, I realised that adding:

INSERT INTO #t 
VALUES (6, 50),
    (7, 60),
    (8, 50),
    (8, 60)

The group 8 did not appear as one item is present in group 6 and the other is in group 7. So, I did a lot of checks and concluded that the following code is the one that guarantees the results and also gives traceability to verify if response is correct or not:

SELECT DISTINCT Group_Number FROM 
    (
    SELECT T1.Group_Number, T1.Rows, T2.Group_Number as Comparing_With_Other_Group, Count(DISTINCT T2.ID) AS Rows_On_Other_Group
        FROM (
            SELECT Group_Number, Count(DISTINCT ID) AS Rows 
                FROM #T
                GROUP BY Group_Number
        ) T1
        INNER JOIN #T AS T2
            ON T1.Group_Number <> T2.Group_Number
            AND EXISTS (SELECT 1 FROM #T WHERE #T.Group_Number = T1.Group_Number and #T.ID = T2.ID)
        GROUP BY T1.Group_Number, T2.Group_Number, T1.Rows
    ) SubQry
    WHERE Rows = Rows_On_Other_Group

If you run the SubQry only you will see the traceability while the full query will show you the Groups where the system can find another group that filtering the ID's for the ones in the Group that I am searching, finds the same number of ID's.

like image 159
Angel M. Avatar answered Dec 31 '22 12:12

Angel M.


You can try the following query.

SELECT DISTINCT Group_Number FROM #temp
    WHERE NOT EXISTS (SELECT DISTINCT Group_Number G2 
                    FROM #temp AS T2 
                    WHERE T2.Group_Number <> #temp.Group_Number
                      AND T2.ID = #temp.ID)

The output is-

Group_Number
1
3

Demo

like image 38
Suraj Kumar Avatar answered Dec 31 '22 10:12

Suraj Kumar