I've got an implementation for this that uses a super hairy recursive CTE that is really hard to follow/maintain. I was hoping that one of the brains on SO could come up with some more straightforward tSQL approach code to accomplish the following:
Table Documents
DocID SortOrder PageCount StartPgNum EndPgNum
5 1 2 {1} {2}
8 2 7 {3} {9}
22 3 3 {10} {12}
For the table given above, I need a query to populate StartPgNum and EndPgNum (Sample values included in the example in {} to make the intentions clearer for what I need.
Assumptions:
* DocID, SortOrder, and PageCount are pre-populated.
* StartPgNum and EndgNum need to be populated by tSQL code.
* SortOrder always starts at 1, and is continuous with no gaps.
* Documents should get a continuous page numbering scheme as ordered by SortOrder
Updated to be better :)
DECLARE @temp TABLE (DocID INT, SortOrder INT, PageCount INT)
INSERT INTO @temp VALUES (5, 1, 2)
INSERT INTO @temp VALUES (8, 2, 7)
INSERT INTO @temp VALUES (22, 3, 3)
SELECT
*,
StartPgNum + PageCount-1 AS EndPgNum
FROM
(SELECT
DocID,
SortOrder,
PageCount,
ISNULL((SELECT SUM(PageCount)+1 FROM @temp WHERE SortOrder < parent.SortOrder), 1) AS StartPgNum
FROM
@temp parent) _temp
I did some testing on all of the solutions provided here in the other answers, my original "Hairy Recursive CTE" option and for the sake of completeness a simple cursor based approach. To my great surprise the cursor option performed the best by a clear margin in all my tests (1K Rows, 10KRows, 50K Rows, 500K Rows)
Here are the average times for each approach for 10K records:
Hairy Recursive CTE: 3 minutes 55 seconds
CROSS APPLY (Ben Dempsey): 21-25 seconds
SUBSELECTS (Tim Khouri): 19-21 seconds
CURSOR: 1-2 Seconds
Here is my cursor based solution:
Declare @temp TABLE(
DocID INT PRIMARY KEY NOT NULL,
SortOrder INT NOT NULL,
PageCount INT NOT NULL,
BegPg int,
EndPg int
)
Insert into @temp (DocID,SortOrder,PageCount)
SELECT top 50000 docid, ROW_NUMBER() OVER (ORDER BY DOCID),Pages FROM tblDocuments
DECLARE @PC int
SET @PC=1
DECLARE @FetchPageCount int
DECLARE @FetchDocID int
DECLARE myCursor CURSOR FOR
SELECT DocID, PageCount FROM @temp ORDER BY SortOrder
OPEN myCursor
FETCH NEXT FROM myCursor INTO @FetchDocID,@FetchPageCount
WHILE @@FETCH_STATUS = 0
BEGIN
UPDATE @temp SET BegPg=@PC, EndPg=@PC+ @FetchPageCount-1
WHERE (Docid=@fetchDocid)
SET @PC = @PC + @FetchPageCount
FETCH NEXT FROM myCursor INTO @FetchDocID,@FetchPageCount
END
CLOSE myCursor
DEALLOCATE myCursor
SELECT * FROM @temp
Who would have guessed it? Maybe cursors aren't always evil.
A word of warning: Lest you be tempted to replace the update to the "WHERE CURRENT OF myCursor" syntax, it performed much slower than using the current version with a where clause, although still faster than most of the other approaches.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With