
Is there any performance issue using ROW_NUMBER to implement table paging in SQL Server 2008?

I want to implement table paging using this method:

DECLARE @PageNum int = 2;
DECLARE @PageSize int = 10;

WITH OrdersRN AS
(
    SELECT ROW_NUMBER() OVER(ORDER BY OrderDate, OrderID) AS RowNum
          ,*
      FROM dbo.Orders
)

SELECT * 
  FROM OrdersRN
 WHERE RowNum BETWEEN (@PageNum - 1) * @PageSize + 1 
                  AND @PageNum * @PageSize
 ORDER BY OrderDate ,OrderID;

Is there anything I should be aware of? The table has millions of records.

Thanks.

EDIT: After using the suggested MAXROWS method for some time (which works really fast), I had to switch back to the ROW_NUMBER method because of its greater flexibility. I am also very happy with its speed so far (I am working with a view that has more than 1M records and 10 columns). To support any kind of query, I use the following modification:

CREATE PROCEDURE [dbo].[PageSelect]
(
  @Sql nvarchar(512),
  @OrderBy nvarchar(128) = 'Id',
  @PageNum int = 1,
  @PageSize int = 0    
)
AS
BEGIN
SET NOCOUNT ON

 Declare @tsql as nvarchar(1024)
 Declare @i int, @j int

 if (@PageSize <= 0) OR (@PageSize > 10000)
  SET @PageSize = 10000  -- never return more than 10K records

 SET @i = (@PageNum - 1) * @PageSize + 1 
 SET @j = @PageNum * @PageSize

 SET @tsql = 
 'WITH MyTableOrViewRN AS
 (
  SELECT ROW_NUMBER() OVER(ORDER BY ' + @OrderBy + ') AS RowNum
     ,*
    FROM MyTableOrView
    WHERE ' + @Sql  + '

 )
 SELECT * 
  FROM MyTableOrViewRN 
  WHERE RowNum BETWEEN ' + CAST(@i as varchar) + ' AND ' + cast(@j as varchar)

 exec(@tsql)
END

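If you use this procedure, make sure you have protected it against SQL injection: @Sql and @OrderBy are concatenated directly into the statement that gets executed.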
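One way to reduce the injection surface of the procedure above is sketched here. It is only a sketch under stated assumptions: the procedure name PageSelectSafe is hypothetical, MyTableOrView is the placeholder from the original, and the free-form @Sql filter is left out entirely because a raw WHERE fragment cannot be made safe by parameterization alone. The ORDER BY column is checked against sys.columns and wrapped with QUOTENAME, and the row bounds are passed as real parameters through sp_executesql.

```sql
-- Hypothetical hardened variant of dbo.PageSelect (name and scope are assumptions).
CREATE PROCEDURE dbo.PageSelectSafe
(
  @OrderBy  sysname = N'Id',
  @PageNum  int = 1,
  @PageSize int = 0
)
AS
BEGIN
  SET NOCOUNT ON;

  -- Same cap as the original: never return more than 10K records.
  IF (@PageSize <= 0) OR (@PageSize > 10000)
    SET @PageSize = 10000;

  -- Accept @OrderBy only if it names a real column of the target object.
  IF NOT EXISTS (SELECT 1
                   FROM sys.columns
                  WHERE object_id = OBJECT_ID(N'dbo.MyTableOrView')
                    AND name = @OrderBy)
  BEGIN
    RAISERROR(N'Unknown ORDER BY column.', 16, 1);
    RETURN;
  END;

  DECLARE @i int, @j int, @tsql nvarchar(max);
  SET @i = (@PageNum - 1) * @PageSize + 1;
  SET @j = @PageNum * @PageSize;

  -- Only the validated, quoted column name is concatenated;
  -- the row bounds stay as parameters of the dynamic statement.
  SET @tsql = N'
  WITH MyTableOrViewRN AS
  (
    SELECT ROW_NUMBER() OVER(ORDER BY ' + QUOTENAME(@OrderBy) + N') AS RowNum
          ,*
      FROM dbo.MyTableOrView
  )
  SELECT *
    FROM MyTableOrViewRN
   WHERE RowNum BETWEEN @i AND @j
   ORDER BY RowNum;';

  EXEC sp_executesql @tsql, N'@i int, @j int', @i = @i, @j = @j;
END
```

If a caller-supplied filter is still needed, exposing a fixed set of typed filter parameters is usually easier to secure than accepting a WHERE fragment as text.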

majkinetor, asked Feb 22 '10 01:02



2 Answers

I've written about this a few times actually; ROW_NUMBER is by far the most flexible and easy-to-use, and performance is good, but for extremely large data sets it is not always the best. SQL Server still needs to sort the data and the sort can get pretty expensive.

There's a different approach that uses a couple of variables and SET ROWCOUNT and is extremely fast, provided that you have the right indexes. It's old, but as far as I know, it's still the most efficient. Basically you can do a totally naïve SELECT with SET ROWCOUNT and SQL Server is able to optimize away most of the real work; the plan and cost end up being similar to two MAX/MIN queries, which is usually a great deal faster than even a single windowing query. For very large data sets this runs in less than 1/10th the time.
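A minimal sketch of what such a SET ROWCOUNT approach can look like against the dbo.Orders table from the question follows. It is an illustration, not necessarily the exact code the answer had in mind: the column types (datetime, int), the uniqueness of OrderID, and an index on (OrderDate, OrderID) are all assumptions.

```sql
DECLARE @PageNum int = 2;
DECLARE @PageSize int = 10;
DECLARE @FirstRow int = (@PageNum - 1) * @PageSize + 1;

-- Assumed column types; adjust to the real table definition.
DECLARE @FirstDate datetime;
DECLARE @FirstID int;

-- Step 1: stop after @FirstRow rows. The variables are assigned once per
-- row read, so they end up holding the key of the first row of the page.
SET ROWCOUNT @FirstRow;

SELECT @FirstDate = OrderDate, @FirstID = OrderID
  FROM dbo.Orders
 ORDER BY OrderDate, OrderID;

-- Step 2: starting at that key, return one page worth of rows.
SET ROWCOUNT @PageSize;

SELECT *
  FROM dbo.Orders
 WHERE OrderDate > @FirstDate
    OR (OrderDate = @FirstDate AND OrderID >= @FirstID)
 ORDER BY OrderDate, OrderID;

-- Always reset, otherwise later statements in the session stay limited.
SET ROWCOUNT 0;
```

Two caveats with this sketch: it only pages over an ordering whose key can be compared in the WHERE clause, and a page number past the end of the table still returns the last row, because the variables keep the last key that was read.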

Having said that, I still always recommend ROW_NUMBER when people ask about how to implement things like paging or groupwise maximums, because of how easy it is to use. I would only start looking at alternatives like the above if you start to notice slowdowns with ROW_NUMBER.

Aaronaught, answered Oct 10 '22 10:10


Recently, I used paging in a data warehouse environment with a star schema. I found that the performance was very good when I restricted the CTE to only query the rows necessary to determine the ROW_NUMBER. I had the CTE return the ROW_NUMBER plus the primary keys of the other rows that helped determine the row number.

In the main query, I referenced the ROW_NUMBER for paging, and then joined to the other tables based on the other primary keys from the CTE. I found that the joins were only performed on the rows that satisfied the WHERE clause in the outer query, saving a great deal of time.
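The pattern described in this answer might look roughly like the following when applied to the dbo.Orders table from the question. This is a sketch under assumptions: OrderID is taken to be the primary key, and the CTE name OrderKeysRN is made up for illustration. A star-schema query would join the key columns to its dimension tables in the same way.

```sql
DECLARE @PageNum int = 2;
DECLARE @PageSize int = 10;

-- The CTE carries only the row number and the key, not the wide row.
WITH OrderKeysRN AS
(
    SELECT ROW_NUMBER() OVER(ORDER BY OrderDate, OrderID) AS RowNum
          ,OrderID
      FROM dbo.Orders
)
-- The join back to the full rows is only needed for the rows of one page.
SELECT o.*
  FROM OrderKeysRN AS k
  JOIN dbo.Orders AS o
    ON o.OrderID = k.OrderID
 WHERE k.RowNum BETWEEN (@PageNum - 1) * @PageSize + 1
                    AND @PageNum * @PageSize
 ORDER BY k.RowNum;
```

The benefit depends on the plan SQL Server picks, so it is worth comparing the execution plan against the single-CTE version before adopting it.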

John Saunders, answered Oct 10 '22 11:10