
Horrifically inefficient query generated by Entity Framework 6

Here's the query I want:

select top 10 *
from vw_BoosterTargetLog
where OrganizationId = 4125
order by Id desc

It executes subsecond.

Here's my Entity Framework (6.1.2) equivalent in C#:

return await db.vw_BoosterTargetLog
    .Where(x => x.OrganizationId == organizationId)
    .OrderByDescending(x => x.Id)
    .Take(numberToRun)
    .ToListNolockAsync();

And here's the SQL that it generates:

SELECT TOP (10) 
    [Project1].[OrganizationId] AS [OrganizationId], 
    [Project1].[BoosterTriggerId] AS [BoosterTriggerId], 
    [Project1].[IsAutomatic] AS [IsAutomatic], 
    [Project1].[C1] AS [C1], 
    [Project1].[CustomerUserId] AS [CustomerUserId], 
    [Project1].[SourceUrl] AS [SourceUrl], 
    [Project1].[TargetUrl] AS [TargetUrl], 
    [Project1].[ShowedOn] AS [ShowedOn], 
    [Project1].[ClickedOn] AS [ClickedOn], 
    [Project1].[BoosterTargetId] AS [BoosterTargetId], 
    [Project1].[TriggerEventGroup] AS [TriggerEventGroup], 
    [Project1].[TriggerIgnoreIdentifiedUsers] AS [TriggerIgnoreIdentifiedUsers], 
    [Project1].[TargetTitle] AS [TargetTitle], 
    [Project1].[BoosterTargetVersionId] AS [BoosterTargetVersionId], 
    [Project1].[Version] AS [Version], 
    [Project1].[CookieId] AS [CookieId], 
    [Project1].[CoalescedId] AS [CoalescedId], 
    [Project1].[OrganizationName] AS [OrganizationName], 
    [Project1].[ShowedOnDate] AS [ShowedOnDate], 
    [Project1].[SampleGroupSectionName] AS [SampleGroupSectionName], 
    [Project1].[Selector] AS [Selector], 
    [Project1].[SelectorStep] AS [SelectorStep]
    FROM ( SELECT 
        [Extent1].[OrganizationId] AS [OrganizationId], 
        [Extent1].[OrganizationName] AS [OrganizationName], 
        [Extent1].[BoosterTriggerId] AS [BoosterTriggerId], 
        [Extent1].[IsAutomatic] AS [IsAutomatic], 
        [Extent1].[SampleGroupSectionName] AS [SampleGroupSectionName], 
        [Extent1].[Selector] AS [Selector], 
        [Extent1].[SelectorStep] AS [SelectorStep], 
        [Extent1].[BoosterTargetId] AS [BoosterTargetId], 
        [Extent1].[CookieId] AS [CookieId], 
        [Extent1].[CustomerUserId] AS [CustomerUserId], 
        [Extent1].[CoalescedId] AS [CoalescedId], 
        [Extent1].[SourceUrl] AS [SourceUrl], 
        [Extent1].[TriggerEventGroup] AS [TriggerEventGroup], 
        [Extent1].[TriggerIgnoreIdentifiedUsers] AS [TriggerIgnoreIdentifiedUsers], 
        [Extent1].[TargetTitle] AS [TargetTitle], 
        [Extent1].[TargetUrl] AS [TargetUrl], 
        [Extent1].[ShowedOn] AS [ShowedOn], 
        [Extent1].[ShowedOnDate] AS [ShowedOnDate], 
        [Extent1].[ClickedOn] AS [ClickedOn], 
        [Extent1].[BoosterTargetVersionId] AS [BoosterTargetVersionId], 
        [Extent1].[Version] AS [Version], 
         CAST( [Extent1].[Id] AS int) AS [C1]
        FROM (SELECT 
    [vw_BoosterTargetLog].[OrganizationId] AS [OrganizationId], 
    [vw_BoosterTargetLog].[OrganizationName] AS [OrganizationName], 
    [vw_BoosterTargetLog].[BoosterTriggerId] AS [BoosterTriggerId], 
    [vw_BoosterTargetLog].[IsAutomatic] AS [IsAutomatic], 
    [vw_BoosterTargetLog].[SampleGroupSectionName] AS [SampleGroupSectionName], 
    [vw_BoosterTargetLog].[Selector] AS [Selector], 
    [vw_BoosterTargetLog].[SelectorStep] AS [SelectorStep], 
    [vw_BoosterTargetLog].[BoosterTargetId] AS [BoosterTargetId], 
    [vw_BoosterTargetLog].[CookieId] AS [CookieId], 
    [vw_BoosterTargetLog].[CustomerUserId] AS [CustomerUserId], 
    [vw_BoosterTargetLog].[CoalescedId] AS [CoalescedId], 
    [vw_BoosterTargetLog].[Id] AS [Id], 
    [vw_BoosterTargetLog].[SourceUrl] AS [SourceUrl], 
    [vw_BoosterTargetLog].[TriggerEventGroup] AS [TriggerEventGroup], 
    [vw_BoosterTargetLog].[TriggerIgnoreIdentifiedUsers] AS [TriggerIgnoreIdentifiedUsers], 
    [vw_BoosterTargetLog].[TargetTitle] AS [TargetTitle], 
    [vw_BoosterTargetLog].[TargetUrl] AS [TargetUrl], 
    [vw_BoosterTargetLog].[ShowedOn] AS [ShowedOn], 
    [vw_BoosterTargetLog].[ShowedOnDate] AS [ShowedOnDate], 
    [vw_BoosterTargetLog].[ClickedOn] AS [ClickedOn], 
    [vw_BoosterTargetLog].[BoosterTargetVersionId] AS [BoosterTargetVersionId], 
    [vw_BoosterTargetLog].[Version] AS [Version]
    FROM [dbo].[vw_BoosterTargetLog] AS [vw_BoosterTargetLog]) AS [Extent1]
        WHERE [Extent1].[OrganizationId] = 4125
    )  AS [Project1]
    ORDER BY [Project1].[C1] DESC

It's ugly as hell, of course, as all EF queries are: I'm not complaining about that. My gripe is that in my testing, best-case, it executes about 10x slower than the first, and worst-case, about 100x slower.

For a query this simple, that seems way beyond all reasonable expectation.

Obviously I can execute SQL directly, or execute a sproc, or something of that sort. And while I'm waiting for feedback, that's what I'll do. But does anyone have any other suggestions about how to speed this up? Is there any way to encourage EF to generate reasonable SQL in a situation like this?

Ken Smith asked Dec 25 '22 22:12


1 Answer

The queries EF produces, while terrible from a readability perspective, are usually still quite reasonable -- and I say that as someone who does almost all data access through stored procedures with hand-written queries. But for that to hold, the model EF has of the database needs to match the actual database; otherwise conversions are introduced, and then it's very easy to get horrible performance drops, because all the data has to be converted and no indexes can be used.

If we eliminate some nesting, the EF query can be simplified to

SELECT TOP (10) *
FROM (
    SELECT *, CAST(Id AS INT) AS C1
    FROM vw_BoosterTargetLog
    WHERE OrganizationId = 4125
) _
ORDER BY C1 DESC

(This doesn't return exactly the same columns, because Id isn't part of the final result set in the real query, but pretend I wrote out all the columns just like EF did.)

If vw_BoosterTargetLog.Id is not actually an INT, this forces a conversion of all rows before the ordering takes place, and the sort on the converted value cannot use an index on Id, which is much slower. The solution is to figure out the actual type of the column (in this case, BIGINT) and update your model accordingly.
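
One way to check the actual column type is to ask the database directly. This is only a sketch: it assumes the view lives in the dbo schema (as the generated SQL above suggests) and that you have permission to read the metadata views.

```sql
-- Look up the declared type of the Id column on the view.
-- INFORMATION_SCHEMA.COLUMNS lists columns of views as well as tables.
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo'
  AND TABLE_NAME = 'vw_BoosterTargetLog'
  AND COLUMN_NAME = 'Id';
```

If DATA_TYPE comes back as bigint, the matching property type on the C# side is long (Int64) rather than int. Once the model agrees with the database, EF should no longer have a reason to emit the CAST, but it's worth inspecting the generated SQL again to confirm.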

Jeroen Mostert answered Jan 01 '23 03:01