Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Grouping SQL Results based on order

I have table with data something like this:

ID     | RowNumber     | Data
------------------------------
1      | 1             | Data
2      | 2             | Data
3      | 3             | Data
4      | 1             | Data
5      | 2             | Data
6      | 1             | Data
7      | 2             | Data
8      | 3             | Data
9      | 4             | Data

I want to group each set of RowNumbers So that my result is something like this:

ID     | RowNumber     | Group | Data
--------------------------------------
1      | 1             | a     | Data
2      | 2             | a     | Data
3      | 3             | a     | Data
4      | 1             | b     | Data
5      | 2             | b     | Data
6      | 1             | c     | Data
7      | 2             | c     | Data
8      | 3             | c     | Data
9      | 4             | c     | Data

The only way I know where each group starts and stops is when the RowNumber starts over. How can I accomplish this? It also needs to be fairly efficient since the table I need to do this on has 52 Million Rows.

Additional Info

ID is truly sequential, but RowNumber may not be. I think RowNumber will always begin with 1 but for example the RowNumbers for group1 could be "1,1,2,2,3,4" and for group2 they could be "1,2,4,6", etc.

like image 343
matthew Avatar asked Feb 15 '13 20:02

matthew


2 Answers

For the clarified requirements in the comments

The rownumbers for group1 could be "1,1,2,2,3,4" and for group2 they could be "1,2,4,6" ... a higher number followed by a lower would be a new group.

A SQL Server 2012 solution could be as follows.

  1. Use LAG to access the previous row and set a flag to 1 if that row is the start of a new group or 0 otherwise.
  2. Calculate a running sum of these flags to use as the grouping value.

Code

WITH T1 AS
(
SELECT *,
       LAG(RowNumber) OVER (ORDER BY ID) AS PrevRowNumber
FROM YourTable
), T2 AS
(
SELECT *,
       IIF(PrevRowNumber IS NULL OR PrevRowNumber > RowNumber, 1, 0) AS NewGroup
FROM T1
)
SELECT ID,
        RowNumber,
        Data,
        SUM(NewGroup) OVER (ORDER BY ID 
                            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS  Grp
FROM T2

SQL Fiddle

Assuming ID is the clustered index the plan for this has one scan against YourTable and avoids any sort operations.

Plan

like image 165
Martin Smith Avatar answered Oct 02 '22 17:10

Martin Smith


If the ids are truly sequential, you can do:

select t.*,
       (id - rowNumber) as grp
from t
like image 20
Gordon Linoff Avatar answered Oct 02 '22 15:10

Gordon Linoff