I have a problem, that I want to partition over a sorted table. Is there a way I can do that?
I am using SQL Server 2016.
Input Table:
|---------|-----------------|-----------|------------|
| prod | sortcolumn | type | value |
|---------|-----------------|-----------|------------|
| X | 1 | P | 12 |
| X | 2 | P | 23 |
| X | 3 | E | 34 |
| X | 4 | P | 45 |
| X | 5 | E | 56 |
| X | 6 | E | 67 |
| Y | 1 | P | 78 |
|---------|-----------------|-----------|------------|
Desired Output
|---------|-----------------|-----------|------------|------------|
| prod | sortcolumn | type | value | rowNr |
|---------|-----------------|-----------|------------|------------|
| X | 1 | P | 12 | 1 |
| X | 2 | P | 23 | 2 |
| X | 3 | E | 34 | 1 |
| X | 4 | P | 45 | 1 |
| X | 5 | E | 56 | 1 |
| X | 6 | E | 67 | 2 |
| Y | 1 | P | 78 | 1 |
|---------|-----------------|-----------|------------|------------|
I am this far:
SELECT
table.*,
ROW_NUMBER() OVER(PARTITION BY table.prod, table.type ORDER BY table.sortColumn) rowNr
FROM table
But this does not restart the row number on the 4th row, since it is the same prod and type. How could I restart on every prod and also on every type change based on the sort criteria, even if the type changes back to something it already was previously? Is this even possible with a ROW_NUMBER function or do I have to work with LEAD and LAG and CASES (which would probably make it very slow, right?)
Thanks!
SQL Server always maintains partition ids in partition boundary order. No matter how you split or merge partitions, the order of boundary values always matches the order of partition ids. SQL Server renumbers as necessary.
bhushan - we cannot use a where clause after the partition by is over. so, your first example is perfect like Frank's.
ORDER BY Characteristics: The ORDER BY clause is used to get the sorted records on one or more columns in ascending or descending order. The ORDER BY clause must come after the WHERE, GROUP BY, and HAVING clause if present in the query.
Determines the partitioning and ordering of a rowset before the associated window function is applied. That is, the OVER clause defines a window or user-specified set of rows within a query result set. A window function then computes a value for each row in the window.
This is a gaps and islands problem. You can use the following query:
SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY prod ORDER BY sortcolumn)
-
ROW_NUMBER() OVER (PARTITION BY prod, type ORDER BY sortcolumn) AS grp
FROM mytable t
to get:
prod sortcolumn type value grp
----------------------------------------
X 1 P 12 0
X 2 P 23 0
X 3 E 34 2
X 4 P 45 1
X 5 E 56 3
X 6 E 67 3
Y 1 P 78 0
Now, field grp
can be used for partitioning:
;WITH IslandsCTE AS (
SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY prod ORDER BY sortcolumn)
-
ROW_NUMBER() OVER (PARTITION BY prod, type ORDER BY sortcolumn) AS grp
FROM mytable t
)
SELECT prod, sortcolumn, type, value,
ROW_NUMBER() OVER (PARTITION BY prod, type, grp ORDER BY sortcolumn) AS rowNr
FROM IslandsCTE
ORDER BY prod, sortcolumn
Demo here
This is a classic 'islands' problem, in that you need to find the 'islands' of records related by prod
and type
, but without grouping together all records matching on prod
and type
.
Here's one way this is typically solved. Set up:
DECLARE @t TABLE (
prod varchar(1),
sortcolumn int,
type varchar(1),
value int
);
INSERT @t VALUES
('X', 1, 'P', 12),
('X', 2, 'P', 23),
('X', 3, 'E', 34),
('X', 4, 'P', 45),
('X', 5, 'E', 56),
('X', 6, 'E', 67),
('Y', 1, 'P', 78)
;
Get some row numbers in place:
;WITH numbered AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY prod, type ORDER BY sortcolumn) as rnX,
ROW_NUMBER() OVER (PARTITION BY prod ORDER BY sortcolumn) as rn
FROM
@t
)
numbered
now looks like this:
prod sortcolumn type value rnX rn
---- ----------- ---- ----------- -------------------- --------------------
X 1 P 12 1 1
X 2 P 23 2 2
X 3 E 34 1 3
X 4 P 45 3 4
X 5 E 56 2 5
X 6 E 67 3 6
Y 1 P 78 1 1
Why is this useful? Well, look at the difference between rnX
and rn
:
prod sortcolumn type value rnX rn rn - rnX
---- ----------- ---- ----------- -------------------- -------------------- --------------------
X 1 P 12 1 1 0
X 2 P 23 2 2 0
X 3 E 34 1 3 2
X 4 P 45 3 4 1
X 5 E 56 2 5 3
X 6 E 67 3 6 3
Y 1 P 78 1 1 0
As you can see, each 'group' shares a rn - rnX
value, and this changes from one group to the next.
So now if we partition by prod
, type
, and group number, then number within that:
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY prod, type, rn - rnX ORDER BY sortcolumn) rowNr
FROM
numbered
ORDER BY
prod, sortcolumn
we're done:
prod sortcolumn type value rnX rn rowNr
---- ----------- ---- ----------- -------------------- -------------------- --------------------
X 1 P 12 1 1 1
X 2 P 23 2 2 2
X 3 E 34 1 3 1
X 4 P 45 3 4 1
X 5 E 56 2 5 1
X 6 E 67 3 6 2
Y 1 P 78 1 1 1
Related reading: Things SQL needs: SERIES()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With