
SQL Server Weird Grouping Scenario by multiple columns and OR

I have a weird grouping scenario and am having trouble finding the best way to do the grouping in SQL.

Imagine we have the following table:

CREATE TABLE Item
(
  KeyId VARCHAR(1) NOT NULL,
  Col1 INT NULL,
  Col2 INT NULL,
  Col3 INT NULL
)

GO

INSERT INTO Item (KeyId, Col1, Col2, Col3)
VALUES 
('a',1,2,3),
('b',5,4,3),
('c',5,7,6),
('d',8,7,9),
('e',11,10,9),
('f',11,12,13),
('g',20,22,21),
('h',23,22,24)

I need to group the records in this table so that any two records sharing the same value in Col1, OR in Col2, OR in Col3 end up in the same group, and the grouping must chain (be transitive). In other words, with the data above, record 'a' has Col3 = 3 and record 'b' also has Col3 = 3, so these two should be in one group. But record 'b' has the same Col1 as record 'c', so 'c' should be in the same group as 'a' and 'b'. Record 'd' has the same Col2 as 'c', so it also belongs to that group. Similarly, 'e' shares Col3 with 'd', and 'f' shares Col1 with 'e'.

On the other hand, records 'g' and 'h' will be in one group (because they have the same Col2 = 22), but this group will be different from the group for records 'a','b','c','d','e','f'.

The result of the query should be something like

KeyId GroupId
'a'   1 
'b'   1
'c'   1
'd'   1
'e'   1
'f'   1
'g'   2
'h'   2
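
Read this way, the rule is a connected-components problem: records are nodes, and two records are linked when they hold the same value in the same column. As a reference for checking any SQL solution, here is a minimal sketch in Python using a union-find structure. The function name `group_items` and the `ITEMS` list are illustrative only, and `None` is assumed to stand for SQL NULL, which never matches another NULL.

```python
# Sketch: group rows that share a value in the same column, with chaining.
# Function and variable names are illustrative, not taken from the question.

def group_items(rows):
    """rows: iterable of (key_id, col1, col2, col3); returns {key_id: group_id}."""
    rows = list(rows)
    parent = {key: key for key, *_ in rows}

    def find(key):
        # Walk up to the representative, shortening the path as we go.
        while parent[key] != key:
            parent[key] = parent[parent[key]]
            key = parent[key]
        return key

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller key as representative so results are stable.
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # (column index, value) -> first key that had that value
    for key, *cols in rows:
        for idx, value in enumerate(cols):
            if value is None:
                continue  # mirrors SQL: NULL never equals NULL
            owner = first_seen.setdefault((idx, value), key)
            union(owner, key)

    # Number the groups 1, 2, ... in order of their smallest key.
    roots = sorted({find(key) for key in parent})
    group_no = {root: n for n, root in enumerate(roots, start=1)}
    return {key: group_no[find(key)] for key in sorted(parent)}


ITEMS = [
    ("a", 1, 2, 3), ("b", 5, 4, 3), ("c", 5, 7, 6), ("d", 8, 7, 9),
    ("e", 11, 10, 9), ("f", 11, 12, 13), ("g", 20, 22, 21), ("h", 23, 22, 24),
]

print(group_items(ITEMS))
```

On the sample data this puts 'a' through 'f' in group 1 and 'g', 'h' in group 2, matching the table above. It does not depend on the order of the keys, which makes it a useful baseline when comparing against a query.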

There is probably a way of doing this with loops or cursors, but I started looking for a cleaner, set-based way, and that seems quite difficult.

Ihor Deyneka asked Aug 01 '18 11:08


1 Answer

Here you go:

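-- g: every row starts as the root of its own chain (anchor member), then
-- the chain is extended to rows sharing a value in the same column.
with g (rootid, previd, level, keyid, col1, col2, col3) as (
  select keyid, '-', 1, keyid, col1, col2, col3 from item
  union all
  select g.rootid, g.keyid, g.level + 1, i.keyid, i.col1, i.col2, i.col3 
    from g
    join item i on i.col1 = g.col1 or i.col2 = g.col2 or i.col3 = g.col3 
    where i.keyid > g.keyid  -- only move to larger keys, so recursion terminates
),
-- m: the group of a row is the smallest root that reaches it.
  m (keyid, rootid) as (
  select keyid, min(rootid) from g group by keyid
)
select * from m;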

Result:

keyid  rootid  
-----  ------
a      a       
b      a       
c      a       
d      a       
e      a       
f      a       
g      g       
h      g       
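
One caveat worth checking against your real data: the recursive member joins only rows with i.keyid > g.keyid, so a root reaches other rows only through chains of strictly increasing keyid. That is enough for the sample data, but a row linked to a group only through a row with a larger key can be left out. The following is a small Python model of the query's logic, not SQL Server itself; the function name `cte_min_root` and the `REORDERED` data are made up for illustration, and `None` is assumed to stand for NULL.

```python
# Models the CTE: each row is a root, and the recursive step follows a shared
# column value only toward a strictly larger keyid. Names are illustrative.

def cte_min_root(rows):
    """Return {keyid: min(rootid)}, as the query's final SELECT would."""
    by_key = {key: cols for key, *cols in rows}

    def linked(a, b):
        # Same-position columns are equal; None models NULL (never equal).
        return any(x is not None and x == y
                   for x, y in zip(by_key[a], by_key[b]))

    min_root = {}
    for root in by_key:
        frontier, seen = [root], {root}
        while frontier:
            current = frontier.pop()
            min_root[current] = min(min_root.get(current, current), root)
            for other in by_key:
                if other > current and other not in seen and linked(current, other):
                    seen.add(other)
                    frontier.append(other)
    return min_root


ITEMS = [
    ("a", 1, 2, 3), ("b", 5, 4, 3), ("c", 5, 7, 6), ("d", 8, 7, 9),
    ("e", 11, 10, 9), ("f", 11, 12, 13), ("g", 20, 22, 21), ("h", 23, 22, 24),
]

# 'a' and 'c' share Col1, 'b' and 'c' share Col2: one group by the rule.
REORDERED = [("a", 1, 10, 100), ("b", 2, 20, 200), ("c", 1, 20, 300)]

print(cte_min_root(ITEMS))
print(cte_min_root(REORDERED))
```

For the sample data the model reproduces the result shown above. For `REORDERED`, 'b' keeps rootid 'b' even though it is connected to 'a' through 'c', because the only path from 'a' to 'b' would have to step from 'c' down to 'b'. If your keys can be ordered like that, the query needs a different traversal, for example an iterative update that repeats until no group changes.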

Note: Keep in mind that SQL Server by default limits recursive CTEs to 100 levels of recursion. For this query, that is the length of the longest chain followed from a single root, not the total number of rows in the table. So although the approach above works, long chains will hit the limit, and you'll get the message:

The maximum recursion 100 has been exhausted before statement completion.

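If this happens, consider appending the query hint option (maxrecursion 32767) to the end of the statement. 32767 is the largest value accepted; a value of 0 removes the limit altogether.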

The Impaler answered Nov 13 '22 10:11