Thanks to pagination, we can split a large dataset into chunks (or pages) that we gradually fetch and display to the user, reducing the load on the database. Pagination also solves a lot of performance issues on both the client and the server side!
The Good part

To do pagination in SQL, we get the first page by requesting the first 10 records with LIMIT 10. For the next page, we introduce OFFSET: we skip past the first 10 records with OFFSET 10 and then perform the same LIMIT 10 request as in the first query.
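For instance, a minimal sketch with a hypothetical items table (LIMIT/OFFSET as in MySQL/PostgreSQL; the table and ordering column are illustrative assumptions, not from the original):

-- Page 1: the first 10 records
SELECT * FROM items ORDER BY id LIMIT 10;

-- Page 2: skip the first 10 records, then take the next 10
SELECT * FROM items ORDER BY id LIMIT 10 OFFSET 10;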
To give you a brief answer to your question: if you execute the Skip(n).Take(m) methods in LINQ (with SQL Server 2005/2008 as the database server), your query will use a SELECT ROW_NUMBER() OVER ... statement, which is essentially direct paging in the SQL engine.

To give you an example, I have a database table called MtCity and I wrote the following query (it works with LINQ to Entities as well):
using (DataClasses1DataContext c = new DataClasses1DataContext())
{
    var query = (from MtCity2 c1 in c.MtCity2s
                 select c1).Skip(3).Take(3);
    // Do something with the query.
}
The resulting query will be:
SELECT [t1].[CodCity],
[t1].[CodCountry],
[t1].[CodRegion],
[t1].[Name],
[t1].[Code]
FROM (
SELECT ROW_NUMBER() OVER (
ORDER BY [t0].[CodCity],
[t0].[CodCountry],
[t0].[CodRegion],
[t0].[Name],
[t0].[Code]) AS [ROW_NUMBER],
[t0].[CodCity],
[t0].[CodCountry],
[t0].[CodRegion],
[t0].[Name],
[t0].[Code]
FROM [dbo].[MtCity] AS [t0]
) AS [t1]
WHERE [t1].[ROW_NUMBER] BETWEEN @p0 + 1 AND @p0 + @p1
ORDER BY [t1].[ROW_NUMBER]
This is windowed data access (pretty cool, by the way, because it starts returning data from the very beginning and accesses the table only while the conditions are met). It will be very similar to:
With CityEntities As
(
Select ROW_NUMBER() Over (Order By CodCity) As Row,
CodCity -- only the index is accessed here, since CodCity is the primary key
From dbo.mtcity
)
Select [t0].[CodCity],
[t0].[CodCountry],
[t0].[CodRegion],
[t0].[Name],
[t0].[Code]
From CityEntities c
Inner Join dbo.MtCity t0 on c.CodCity = t0.CodCity
Where c.Row Between @p0 + 1 AND @p0 + @p1
Order By c.Row Asc
The difference is that this second query will execute faster than the LINQ-generated one, because it uses exclusively the index to build the data-access window. This means that if you need some filtering, the filtering should be (or must be) done in the CTE (where the row number is created), and some indexes should be created as well to keep up the good performance.
Now, what's better?
If you have a fairly solid workflow in your logic, implementing the proper SQL way will be complicated. In that case, LINQ will be the solution.

If you can push that part of the logic directly down to SQL (into a stored procedure), it will be even better, because you can implement the second query I showed you (using indexes) and allow SQL Server to generate and store the execution plan of the query (improving performance).
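As an illustration only, here is a minimal sketch of such a stored procedure, combining the CTE query above with the filtering and indexing advice; the CodCountry filter (type assumed), parameter names, and index name are assumptions added for the example:

-- Supporting index so the CTE can be resolved from the index alone
CREATE INDEX IX_MtCity_CodCountry_CodCity
    ON dbo.MtCity (CodCountry, CodCity);
GO

CREATE PROCEDURE dbo.GetCityPage
    @codCountry INT,  -- hypothetical filter value
    @p0 INT,          -- rows to skip
    @p1 INT           -- page size
AS
BEGIN
    SET NOCOUNT ON;

    WITH CityEntities AS
    (
        SELECT ROW_NUMBER() OVER (ORDER BY CodCity) AS Row,
               CodCity
        FROM dbo.MtCity
        WHERE CodCountry = @codCountry  -- filter where the row number is created
    )
    SELECT t0.CodCity, t0.CodCountry, t0.CodRegion, t0.Name, t0.Code
    FROM CityEntities c
    INNER JOIN dbo.MtCity t0 ON c.CodCity = t0.CodCity
    WHERE c.Row BETWEEN @p0 + 1 AND @p0 + @p1
    ORDER BY c.Row ASC;
END;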
Try using

SELECT *
FROM [TableX]
ORDER BY [FieldX]
OFFSET 500 ROWS
FETCH NEXT 100 ROWS ONLY

to get rows 501 to 600 on the SQL Server side, without loading them all into memory first. Note that this syntax only became available with SQL Server 2012.
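A sketch of a generic, parameterized form of the same idea (the @PageNumber/@PageSize names and the 1-based page convention are assumptions for illustration):

DECLARE @PageNumber INT = 6;   -- 1-based page index
DECLARE @PageSize   INT = 100; -- rows per page

SELECT *
FROM [TableX]
ORDER BY [FieldX]
OFFSET (@PageNumber - 1) * @PageSize ROWS
FETCH NEXT @PageSize ROWS ONLY;

With @PageNumber = 6 and @PageSize = 100 this skips 500 rows and fetches the next 100, i.e. rows 501 to 600 as in the example above.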
While LINQ to SQL will generate an OFFSET clause (possibly emulated using ROW_NUMBER() OVER(), as others have mentioned), there is an entirely different, much faster way to perform paging in SQL. This is often called the "seek method", as described in this blog post.
SELECT TOP 10 first_name, last_name, score
FROM players
WHERE (score < @previousScore)
OR (score = @previousScore AND player_id < @previousPlayerId)
ORDER BY score DESC, player_id DESC
The @previousScore and @previousPlayerId values are the respective values of the last record from the previous page. This allows you to fetch the "next" page. If the ORDER BY direction is ASC, simply use > instead.
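For example, a sketch of the ascending variant of the same query (same players table and columns as above):

SELECT TOP 10 first_name, last_name, score
FROM players
WHERE (score > @previousScore)
   OR (score = @previousScore AND player_id > @previousPlayerId)
ORDER BY score ASC, player_id ASC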
With the above method, you cannot immediately jump to page 4 without having first fetched the previous 40 records. But often, you do not want to jump that far anyway. Instead, you get a much faster query that might be able to fetch data in constant time, depending on your indexing. Plus, your pages remain "stable", no matter if the underlying data changes (e.g. on page 1, while you're on page 4).
This is the best way to implement paging when lazy loading more data in web applications, for instance.
Note, the "seek method" is also called keyset paging.
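Since the near-constant-time behaviour depends on indexing, here is a sketch of a supporting index that matches the seek predicate and sort order (the index name and INCLUDE columns are assumptions):

-- Composite index matching the seek predicate and ORDER BY,
-- letting the engine seek straight to (@previousScore, @previousPlayerId)
CREATE INDEX IX_players_score_player_id
    ON players (score DESC, player_id DESC)
    INCLUDE (first_name, last_name);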
LinqToSql will automatically convert a .Skip(N1).Take(N2) into the T-SQL syntax for you. In fact, every "query" you write in LINQ is actually just building a SQL query for you in the background. To verify this, just run SQL Profiler while your application is running.

The skip/take methodology has worked very well for me and, from what I read, for others as well.

Out of curiosity, what type of self-paging query do you have that you believe is more efficient than LINQ's Skip/Take?