I have a simple paged linq query against one entity:
var data = (from t in ctx.ObjectContext.Widgets
where t.CampaignId == campaignId &&
t.CalendarEventId == calendarEventId
(t.RecurringEventId IS NULL OR t.RecurringEventId = recurringEventId)
select t);
data = data.OrderBy(t => t.Id);
if (page > 0)
{
data = data.Skip(rows * (page - 1)).Take(rows);
}
var l = data.ToList();
I expected it to generate SQL similar to:
select top 50 * from Widgets w where CampaignId = xxx AND CalendarEventId = yyy AND (RecurringEventId IS NULL OR RecurringEventId = zzz) order by w.Id
When I run the above query in SSMS, it returns quickly (had to rebuild my indexes first).
However, the generated SQL is different. It contains a nested query as shown below:
SELECT TOP (50)
[Project1].[Id] AS [Id],
[Project1].[CampaignId] AS [CampaignId]
<redacted>
FROM ( SELECT [Project1].[Id] AS [Id],
[Project1].[CampaignId] AS [CampaignId],
<redacted>,
row_number() OVER (ORDER BY [Project1].[Id] ASC) AS [row_number]
FROM ( SELECT
[Extent1].[Id] AS [Id],
[Extent1].[CampaignId] AS [CampaignId],
<redacted>
FROM [dbo].[Widgets] AS [Extent1]
WHERE ([Extent1].[CampaignId] = @p__linq__0) AND ([Extent1].[CalendarEventId] = @p__linq__1) AND ([Extent1].[RecurringEventId] = @p__linq__2 OR [Extent1].[RecurringEventId] IS NULL)
) AS [Project1]
) AS [Project1]
WHERE [Project1].[row_number] > 0
ORDER BY [Project1].[Id] ASC
The Widgets table is enormous and the inner query returns 100000s of records, causing a timeout.
Is there anything I can do to change the generation? Anything I am doing wrong?
UPDATE
I finally managed to refactor my code to return the results relatively quickly:
var data = (from t in ctx.ObjectContext.Widgets
where t.CampaignId == campaignId &&
t.CalendarEventId == calendarEventId
(t.RecurringEventId IS NULL OR t.RecurringEventId = recurringEventId)
select t)).AsEnumerable().Select((item, index) => new { Index = index, Item = item });
data = data.OrderBy(t => t.Index);
if (page > 0)
{
data = data.Where(t => t.Index >= (rows * (page - 1)));
}
data = data.Take(rows);
Note, the page > 0
logic is simply used to prevent an invalid parameter being used; it does no optimization. In fact page > 1
, while valid, does not provide any noticeable optimization for the 1st page; since the Where
is not a slow operation.
The AsNoTracking method tells Entity Framework to stop that additional work and so, it can improve the performance of your application. So, in theory, a query with AsNoTracking should perform better than without.
Because an open connection to the database consumes a valuable resource, the Entity Framework opens and closes the database connection only as needed. You can also explicitly open the connection. For more information, see Managing Connections and Transactions. Once in each application domain.
Dapper is literally much faster than Entity Framework Core considering the fact that there are no bells and whistles in Dapper. It is a straight forward Micro ORM that has minimal features as well. It is always up to the developer to choose between these 2 Awesome Data Access Technologies.
Prior SQL Server 2012, the generated SQL code is the best way to perform pagging. Yes, it is awfull and very inefficient but is the best you can do even writing your own SQL scritp by hand. There are tons of digital ink about this in the net. Just google it.
In the firt page, this can be optimized not doing Skip
and just Take
but in any other page you are f***** up.
A workarround could be to generate your own row_number in persistence (an auto-identity could work) and just do where(widget.number > (page*rows) ).Take(rows)
in code. If there is a good index in your widget.number
the query should be very fast. But, this breaks the dynamic orderBy
.
However, I can see in your code that you are ordering by widget.id
always; so, if dynamic orderBy
is not essential, this could be a valid workaround.
Will you take your own medicine?
could you ask me.
No, I will not. The best way to deal with this is having a persistence read-model in wich you can even have one table per widget orderBy field with its own widget.number
. The problem is that modeling a system with a persistence read-model just for this issue is too crazy. Having a read-model is part of the overall design of your system and requires taking it in account from the very beginning of the design and development of a system.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With