I want to get the n-th to m-th records in a table. Which of the two solutions below is the better choice?
Solution 1:
SELECT * FROM Table WHERE ID >= n AND ID <= m
Solution 2:
SELECT * FROM (SELECT *, ROW_NUMBER() OVER (ORDER BY ID) AS row FROM Table )a WHERE row >= n AND row <= m
The ROW_NUMBER function cannot currently be used in a WHERE clause. Derby does not currently support ORDER BY in subqueries, so there is currently no way to guarantee the order of rows in the SELECT subquery.
In my case, the optimizer's subtree cost suggests that ROW_NUMBER is about 60% more efficient, and according to statistics IO it uses about 20% less CPU time. However, in real elapsed time the ROW_NUMBER solution takes about 80% longer. So the GROUP BY wins in my case.
The ROW_NUMBER function is a SQL ranking function that assigns a sequential rank number to each new record in a partition. When the SQL Server ROW_NUMBER function detects two identical values in the same partition, it still assigns a different rank number to each of them.
The ROW_NUMBER() is a window function that assigns a sequential integer to each row within the partition of a result set. The row number starts with 1 for the first row in each partition. The following shows the syntax of the ROW_NUMBER() function: ROW_NUMBER() OVER ( [PARTITION BY partition_expression, ... ]
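As others have already pointed out, the two queries return different results, so this is comparing apples to oranges. Solution 1 filters on the ID values themselves, while Solution 2 filters on each row's position in ID order; the two agree only when the IDs are contiguous and start at 1.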
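To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the database in the question. The table name `items` and its rows are made up for illustration, and the window function needs SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical demo table; the name "items" and its contents are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
# IDs 3 and 4 are missing, as if those rows had been deleted.
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (5, "c"), (6, "d"), (7, "e")],
)

n, m = 2, 4

# Solution 1: filter on the ID values themselves.
by_id = [r[0] for r in conn.execute(
    "SELECT id FROM items WHERE id >= ? AND id <= ? ORDER BY id", (n, m)
)]

# Solution 2: filter on the position of each row in ID order.
by_position = [r[0] for r in conn.execute(
    """
    SELECT id FROM (
        SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rn FROM items
    ) AS a
    WHERE rn >= ? AND rn <= ?
    ORDER BY rn
    """,
    (n, m),
)]

print(by_id)        # [2]
print(by_position)  # [2, 5, 6]
```

With gaps in the IDs, Solution 1 returns a single row while Solution 2 returns the second through fourth rows, so only Solution 2 answers "n-th to m-th record".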
But the underlying question remains: which is faster: keyset driven paging or rownumber driven paging?
Keyset driven paging relies on remembering the first and last keys of the currently displayed page, and requesting the next or previous set of rows relative to those keys:
Next page:
select top (<pagesize>) ... from <table> where key > @last_key_on_current_page order by key;
Previous page:
select top (<pagesize>) ... from <table> where key < @first_key_on_current_page order by key desc;
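The two queries above can be sketched end to end as follows. This is only an illustration under assumptions: it uses Python's sqlite3 module, so `LIMIT` takes the place of SQL Server's `TOP`, and the `posts` table, `PAGE_SIZE`, and the helper names are hypothetical.

```python
import sqlite3

# Hypothetical table with ten rows, ids 1..10.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO posts (id, title) VALUES (?, ?)",
    [(i, f"post {i}") for i in range(1, 11)],
)

PAGE_SIZE = 3

def next_page(last_key):
    """Rows that follow the last key shown on the current page."""
    return conn.execute(
        "SELECT id, title FROM posts WHERE id > ? ORDER BY id LIMIT ?",
        (last_key, PAGE_SIZE),
    ).fetchall()

def previous_page(first_key):
    """Rows that precede the first key shown on the current page."""
    rows = conn.execute(
        "SELECT id, title FROM posts WHERE id < ? ORDER BY id DESC LIMIT ?",
        (first_key, PAGE_SIZE),
    ).fetchall()
    return rows[::-1]  # the query walks backwards, so flip it for display

page1 = next_page(0)                 # ids 1, 2, 3
page2 = next_page(page1[-1][0])      # ids 4, 5, 6
back = previous_page(page2[0][0])    # ids 1, 2, 3 again
```

Note that the "previous page" query returns its rows in descending order, so the caller has to reverse them before display.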
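This approach has two main advantages over the ROW_NUMBER approach, or over the equivalent LIMIT approach of MySQL: it is correct, because rows inserted or deleted between requests do not make rows repeat or vanish across pages, and it is fast, because each page is a seek to the remembered key followed by a short range scan in the desired direction.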
However, this approach is difficult to implement, hard for the average programmer to understand, and not supported by the tools.
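One way to see how the two paging styles behave when data changes between page requests is the following sketch. It is an illustration only: SQLite (via Python's sqlite3, version 3.25 or newer for the window function) stands in for the real database, and the `posts` table and helper names are hypothetical.

```python
import sqlite3

# Hypothetical table with ids 1..9.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO posts (id) VALUES (?)", [(i,) for i in range(1, 10)])

PAGE_SIZE = 3

def page_by_row_number(page_no):
    """Fetch a page by counting rows from the start of the ordering."""
    first = (page_no - 1) * PAGE_SIZE + 1
    last = page_no * PAGE_SIZE
    return [r[0] for r in conn.execute(
        """
        SELECT id FROM (
            SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rn FROM posts
        ) AS t
        WHERE rn BETWEEN ? AND ?
        ORDER BY rn
        """,
        (first, last),
    )]

def page_by_keyset(last_key):
    """Fetch the page that follows the last key the user has seen."""
    return [r[0] for r in conn.execute(
        "SELECT id FROM posts WHERE id > ? ORDER BY id LIMIT ?",
        (last_key, PAGE_SIZE),
    )]

page1 = page_by_row_number(1)                   # [1, 2, 3]
conn.execute("DELETE FROM posts WHERE id = 2")  # a row disappears meanwhile

rn_page2 = page_by_row_number(2)      # [5, 6, 7]: id 4 is never shown
ks_page2 = page_by_keyset(page1[-1])  # [4, 5, 6]
```

After the delete, every remaining row shifts up by one position, so the row-number page skips id 4, while the keyset page simply continues after the last key that was displayed.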
ROW_NUMBER driven paging is the common approach introduced with Linq queries:
select ... from (select ..., row_number() over (...) as rn from table) as t where rn between @firstRow and @lastRow;
(or a similar query using TOP). This approach is easy to implement and is supported by tools (specifically by the Linq .Skip and .Take operators). But it is guaranteed to scan the index in order to count the rows. It is usually very fast for page 1 and gradually slows down as one goes to higher and higher page numbers.
As a bonus, with this solution it is very easy to change the sort order (simply change the OVER clause).
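A minimal sketch of row-number paging with a switchable sort order might look like this. It assumes Python's sqlite3 module (SQLite 3.25 or newer) in place of SQL Server; the `customers` table, the `ORDERINGS` whitelist, and `fetch_rows` are hypothetical names introduced for the example.

```python
import sqlite3

# Hypothetical customer table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "Dana"), (2, "Alex"), (3, "Cleo"), (4, "Bram"), (5, "Eryn")],
)

# Whitelisted sort orders; only these strings are ever spliced into the SQL.
ORDERINGS = {"id": "id", "name": "name, id"}

def fetch_rows(first_row, last_row, order="id"):
    """Return rows first_row..last_row (1-based) under the chosen ordering."""
    over = ORDERINGS[order]  # raises KeyError for an unknown ordering
    sql = f"""
        SELECT id, name FROM (
            SELECT id, name, ROW_NUMBER() OVER (ORDER BY {over}) AS rn
            FROM customers
        ) AS t
        WHERE rn BETWEEN ? AND ?
        ORDER BY rn
    """
    return conn.execute(sql, (first_row, last_row)).fetchall()

by_id = fetch_rows(2, 3)                  # rows 2 and 3 in id order
by_name = fetch_rows(2, 3, order="name")  # rows 2 and 3 in name order
```

Only the expression inside OVER changes between the two calls. The whitelist is a design choice in this sketch so that no user-supplied text is ever formatted into the query.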
Overall, given the ease of the ROW_NUMBER() based solutions, the support they have from Linq, and the simplicity of using arbitrary sort orders, the ROW_NUMBER based solutions are adequate for moderate data sets. For large and very large data sets, ROW_NUMBER() can cause serious performance issues.
One other thing to consider is that there is often a definite pattern of access. Often the first few pages are hot and pages after 10 are basically never viewed (e.g. most recent posts). In this case, the penalty that ROW_NUMBER() incurs for visiting the bottom pages (pages for which a large number of rows have to be counted to reach the starting result row) may well be ignored.
And finally, keyset pagination is great for dictionary navigation, which ROW_NUMBER() cannot accommodate easily. Dictionary navigation is where, instead of using a page number, users navigate to certain anchors, such as letters of the alphabet. A typical example is a Rolodex-like contact sidebar: you click on M and you navigate to the first customer name that starts with M.
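Dictionary navigation falls out of the same keyset query, with the anchor letter playing the role of the remembered key. The sketch below assumes Python's sqlite3 module with its default case-sensitive text comparison; the `customers` table, the sample names, and `jump_to` are hypothetical.

```python
import sqlite3

# Hypothetical contact list keyed by name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO customers (name) VALUES (?)",
    [(n,) for n in ["Adams", "Baker", "Lopez", "Mason",
                    "Miller", "Moore", "Nolan", "Young"]],
)

PAGE_SIZE = 3

def jump_to(anchor):
    """First page of names that sort at or after the anchor, e.g. 'M'."""
    return [r[0] for r in conn.execute(
        "SELECT name FROM customers WHERE name >= ? ORDER BY name LIMIT ?",
        (anchor, PAGE_SIZE),
    )]

page_m = jump_to("M")  # ['Mason', 'Miller', 'Moore']
```

If no name starts with the chosen letter, this query lands on the next names in sort order, and paging onward from there uses the usual "name greater than the last name shown" query.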