How can I get a RANK that restarts at partition change? I have this table:
ID Date Value
1 2015-01-01 1
2 2015-01-02 1 <redundant
3 2015-01-03 2
4 2015-01-05 2 <redundant
5 2015-01-06 1
6 2015-01-08 1 <redundant
7 2015-01-09 1 <redundant
8 2015-01-10 2
9 2015-01-11 3
10 2015-01-12 3 <redundant
and I'm trying to delete all the rows where the Value is not changed from the previous entry (marked with < redundant). I've tried using cursors but it takes too long, as the table has ~50 million rows.
I've also tried using RANK:
SELECT ID, Date, Value,
RANK() over(partition by Value order by Date ASC) Rank,
FROM DataLogging
ORDER BY Date ASC
but I get:
ID Date Value Rank (Rank)
1 2015-01-01 1 1 (1)
2 2015-01-02 1 2 (2)
3 2015-01-03 2 1 (1)
4 2015-01-05 2 2 (2)
5 2015-01-06 1 3 (1)
6 2015-01-08 1 4 (2)
7 2015-01-09 1 5 (3)
8 2015-01-10 2 3 (1)
9 2015-01-11 3 1 (1)
10 2015-01-12 3 2 (2)
in parantheses is the Rank I would want, so that I can filter out rows with Rank = 1 and delete the rest of the rows.
EDIT: I've accepted the answer that seemed the easiest to write, but unfortunately none of the answers runs fast enough for deleting the rows. In the end I've decided to use the CURSOR afterall. I've split the data in chuncks of about 250k rows and the cursor runs through and deletes the rows in ~11 mins per batch of 250k rows, and the answers below, with DELETE, take ~35 mins per batch of 250k rows.
Here is a somewhat convoluted way to do it:
WITH CTE AS
(
SELECT *,
ROW_NUMBER() OVER(ORDER BY [Date]) RN1,
ROW_NUMBER() OVER(PARTITION BY Value ORDER BY [Date]) RN2
FROM dbo.YourTable
), CTE2 AS
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY Value, RN1 - RN2 ORDER BY [Date]) N
FROM CTE
)
SELECT *
FROM CTE2
ORDER BY ID;
The results are:
╔════╦════════════╦═══════╦═════╦═════╦═══╗
║ ID ║ Date ║ Value ║ RN1 ║ RN2 ║ N ║
╠════╬════════════╬═══════╬═════╬═════╬═══╣
║ 1 ║ 2015-01-01 ║ 1 ║ 1 ║ 1 ║ 1 ║
║ 2 ║ 2015-01-02 ║ 1 ║ 2 ║ 2 ║ 2 ║
║ 3 ║ 2015-01-03 ║ 2 ║ 3 ║ 1 ║ 1 ║
║ 4 ║ 2015-01-05 ║ 2 ║ 4 ║ 2 ║ 2 ║
║ 5 ║ 2015-01-06 ║ 1 ║ 5 ║ 3 ║ 1 ║
║ 6 ║ 2015-01-08 ║ 1 ║ 6 ║ 4 ║ 2 ║
║ 7 ║ 2015-01-09 ║ 1 ║ 7 ║ 5 ║ 3 ║
║ 8 ║ 2015-01-10 ║ 2 ║ 8 ║ 3 ║ 1 ║
║ 9 ║ 2015-01-11 ║ 3 ║ 9 ║ 1 ║ 1 ║
║ 10 ║ 2015-01-12 ║ 3 ║ 10 ║ 2 ║ 2 ║
╚════╩════════════╩═══════╩═════╩═════╩═══╝
To delete the rows you don't want, you just need to do:
DELETE FROM CTE2
WHERE N > 1;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With