I want to implement table paging using this method:
DECLARE @PageNum int, @PageSize int;
SET @PageNum = 2;
SET @PageSize = 10;

WITH OrdersRN AS
(
    -- Number every row in the order the pages should follow
    SELECT ROW_NUMBER() OVER (ORDER BY OrderDate, OrderID) AS RowNum
          ,*
    FROM dbo.Orders
)
SELECT *
FROM OrdersRN
WHERE RowNum BETWEEN (@PageNum - 1) * @PageSize + 1
                 AND @PageNum * @PageSize
ORDER BY OrderDate, OrderID;
Is there anything I should be aware of? The table has millions of records.
Thx.
EDIT:
After using the suggested MAXROWS method for some time (which works really, really fast), I had to switch back to the ROW_NUMBER method because of its greater flexibility. I am also very happy with its speed so far (I am working with a view that has more than 1M records and 10 columns). To use it with any kind of query, I use the following modification:
CREATE PROCEDURE [dbo].[PageSelect]
(
    @Sql      nvarchar(512),          -- WHERE-clause text, concatenated as-is
    @OrderBy  nvarchar(128) = 'Id',   -- ORDER BY expression, concatenated as-is
    @PageNum  int = 1,
    @PageSize int = 0
)
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @tsql nvarchar(1024);
    DECLARE @i int, @j int;

    IF (@PageSize <= 0) OR (@PageSize > 10000)
        SET @PageSize = 10000;  -- never return more than 10K records

    -- First and last row numbers of the requested page
    SET @i = (@PageNum - 1) * @PageSize + 1;
    SET @j = @PageNum * @PageSize;

    -- The outer ORDER BY guarantees the order of the returned page
    SET @tsql =
    'WITH MyTableOrViewRN AS
    (
    SELECT ROW_NUMBER() OVER(ORDER BY ' + @OrderBy + ') AS RowNum
    ,*
    FROM MyTableOrView
    WHERE ' + @Sql + '
    )
    SELECT *
    FROM MyTableOrViewRN
    WHERE RowNum BETWEEN ' + CAST(@i AS varchar(11)) + ' AND ' + CAST(@j AS varchar(11)) + '
    ORDER BY RowNum';

    EXEC(@tsql);
END
If you use this procedure, make sure you guard against SQL injection, because @Sql and @OrderBy are concatenated straight into the statement.
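One way to reduce that risk, as a rough sketch rather than a drop-in replacement: check the sort column against the catalog, quote it with QUOTENAME, and pass the page bounds as typed parameters through sp_executesql. The names dbo.PageSelectSafe and dbo.MyTableOrView are placeholders, and the free-form @Sql filter is left out on purpose, since arbitrary WHERE text cannot be made safe by quoting alone.

```sql
-- Sketch only: dbo.PageSelectSafe and dbo.MyTableOrView are placeholder names.
CREATE PROCEDURE dbo.PageSelectSafe
(
    @OrderBy  sysname = N'Id',
    @PageNum  int = 1,
    @PageSize int = 100
)
AS
BEGIN
    SET NOCOUNT ON;

    -- Accept the sort column only if it really exists on the target object.
    IF NOT EXISTS (SELECT 1
                   FROM sys.columns
                   WHERE object_id = OBJECT_ID(N'dbo.MyTableOrView')
                     AND name = @OrderBy)
    BEGIN
        RAISERROR(N'Unknown ORDER BY column.', 16, 1);
        RETURN;
    END;

    IF @PageNum < 1 SET @PageNum = 1;
    IF @PageSize <= 0 OR @PageSize > 10000 SET @PageSize = 10000;

    DECLARE @tsql nvarchar(max);
    SET @tsql = N'
        WITH RN AS
        (
            SELECT ROW_NUMBER() OVER (ORDER BY ' + QUOTENAME(@OrderBy) + N') AS RowNum, *
            FROM dbo.MyTableOrView
        )
        SELECT *
        FROM RN
        WHERE RowNum BETWEEN (@PageNum - 1) * @PageSize + 1 AND @PageNum * @PageSize
        ORDER BY RowNum;';

    -- Page bounds travel as typed parameters, never as concatenated text.
    EXEC sys.sp_executesql
         @tsql,
         N'@PageNum int, @PageSize int',
         @PageNum = @PageNum,
         @PageSize = @PageSize;
END;
```

A call would then look like EXEC dbo.PageSelectSafe @OrderBy = N'OrderDate', @PageNum = 3, @PageSize = 50; (OrderDate being an assumed column name). If filtering is needed, adding one typed parameter per filter is usually safer than accepting raw SQL text.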
In my experience, an aggregate (DISTINCT or GROUP BY) can be quicker than a ROW_NUMBER() approach.
ROW_NUMBER is a SQL ranking function that assigns a sequential number to each row within a partition. When SQL Server's ROW_NUMBER function encounters two identical values in the same partition, it still assigns a different number to each of them.
ROW_NUMBER() Function without Partition By clause
The PARTITION BY clause is an optional part of the ROW_NUMBER function. If you don't use it, all the records of the result set are treated as a single record group (a single partition), and the numbering is applied across that whole set.
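In some databases, the row_number() window function can be used without ORDER BY in OVER to assign an arbitrary unique value to each row. SQL Server requires an ORDER BY inside OVER for ROW_NUMBER; when the order does not matter, ORDER BY (SELECT NULL) is a common workaround.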
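To make the behaviour described above concrete, here is a small self-contained sketch. The sample rows are made up, and the ORDER BY (SELECT NULL) form is the usual SQL Server idiom for "no particular order", shown here as an illustration only.

```sql
-- Made-up sample rows; three orders share the same date.
DECLARE @Orders TABLE (OrderID int, CustomerID int, OrderDate date);
INSERT INTO @Orders VALUES
    (1, 1, '2020-01-01'),
    (2, 1, '2020-01-01'),
    (3, 1, '2020-01-02'),
    (4, 2, '2020-01-01');

SELECT OrderID, CustomerID, OrderDate,
       -- One numbering across the whole result set (no PARTITION BY)
       ROW_NUMBER() OVER (ORDER BY OrderDate)                         AS RnAll,
       -- Numbering restarts at 1 for each customer
       ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS RnPerCustomer,
       -- RANK gives tied rows the same value, unlike ROW_NUMBER
       RANK()       OVER (ORDER BY OrderDate)                         AS RankAll,
       -- Arbitrary but unique numbering
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL))                     AS RnArbitrary
FROM @Orders;
```

The rows dated 2020-01-01 all get different RnAll values but the same RankAll value. Which of the tied rows gets which number is not guaranteed, which is why a paging query should order by a unique tiebreaker (as the question does with OrderDate, OrderID); otherwise a row can show up on two pages or on none.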
I've written about this a few times actually; ROW_NUMBER is by far the most flexible and easy to use, and performance is good, but for extremely large data sets it is not always the best. SQL Server still needs to sort the data and the sort can get pretty expensive.
There's a different approach that uses a couple of variables and SET ROWCOUNT and is extremely fast, provided that you have the right indexes. It's old, but as far as I know, it's still the most efficient. Basically you can do a totally naïve SELECT with SET ROWCOUNT and SQL Server is able to optimize away most of the real work; the plan and cost end up being similar to two MAX/MIN queries, which is usually a great deal faster than even a single windowing query. For very large data sets this runs in less than 1/10th the time.
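The answer does not spell the technique out, so the following is only my reading of it: the well-known two-step SET ROWCOUNT pattern. It assumes dbo.Orders has a unique, indexed OrderID and that pages are ordered by that column alone; the variable names are mine.

```sql
-- Sketch, assuming a unique, indexed OrderID that defines the page order.
DECLARE @PageNum int, @PageSize int, @FirstRow int, @FirstID int;
SET @PageNum  = 2;
SET @PageSize = 10;
SET @FirstRow = (@PageNum - 1) * @PageSize + 1;

-- Step 1: read only as far as the first row of the requested page.
-- The variable is left holding the OrderID of the last row read.
SET ROWCOUNT @FirstRow;
SELECT @FirstID = OrderID
FROM dbo.Orders
ORDER BY OrderID;

-- Step 2: read one page starting at that key.
SET ROWCOUNT @PageSize;
SELECT *
FROM dbo.Orders
WHERE OrderID >= @FirstID
ORDER BY OrderID;

-- Always reset, otherwise later statements in the session stay limited.
SET ROWCOUNT 0;
```

Two caveats with this sketch: a page number past the end of the table is not detected (step 1 simply stops at the last row), and sorting by a non-unique column needs extra handling. On current SQL Server versions, TOP (@n) can play the same role as SET ROWCOUNT in each step.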
Having said that, I still always recommend ROW_NUMBER when people ask about how to implement things like paging or groupwise maximums, because of how easy it is to use. I would only start looking at alternatives like the above if you start to notice slowdowns with ROW_NUMBER.
Recently, I used paging in a data warehouse environment with a star schema. I found that the performance was very good when I restricted the CTE to only query the rows necessary to determine the ROW_NUMBER. I had the CTE return the ROW_NUMBER plus the primary keys of the other rows that helped determine the row number.
In the main query, I referenced the ROW_NUMBER for paging, and then joined to the other tables based on the other primary keys from the CTE. I found that the joins were only performed on the rows that satisfied the WHERE clause in the outer query, saving a great deal of time.
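A minimal sketch of that keys-only pattern might look like the following. The star-schema names (dbo.FactSales, dbo.DimCustomer, dbo.DimProduct and their columns) are hypothetical and only illustrate the shape of the query.

```sql
-- Sketch with hypothetical star-schema table and column names.
DECLARE @PageNum int, @PageSize int;
SET @PageNum  = 2;
SET @PageSize = 10;

WITH Keys AS
(
    -- Narrow CTE: only the sort columns and the keys needed for later joins.
    SELECT ROW_NUMBER() OVER (ORDER BY f.SaleDate, f.SaleID) AS RowNum,
           f.SaleID, f.CustomerKey, f.ProductKey
    FROM dbo.FactSales AS f
)
SELECT k.RowNum, f.SaleDate, f.Amount, c.CustomerName, p.ProductName
FROM Keys AS k
JOIN dbo.FactSales   AS f ON f.SaleID      = k.SaleID
JOIN dbo.DimCustomer AS c ON c.CustomerKey = k.CustomerKey
JOIN dbo.DimProduct  AS p ON p.ProductKey  = k.ProductKey
WHERE k.RowNum BETWEEN (@PageNum - 1) * @PageSize + 1
                   AND @PageNum * @PageSize
ORDER BY k.RowNum;
```

The idea is that the wide columns and dimension lookups are fetched only for the rows of the requested page. Whether the optimizer really defers the joins that way depends on the plan it picks, so it is worth checking the actual execution plan.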