Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Curious slowness of EF vs SQL

In a heavily multi-threaded scenario, I have problems with a particular EF query. It's generally cheap and fast:

Context.MyEntity
  .Any(se => se.SameEntity.Field == someValue        
     && se.AnotherEntity.Field == anotherValue
     && se.SimpleField == simpleValue
     // few more simple predicates with fields on the main entity
     );

This compiles into a very reasonable SQL query:

SELECT 
CASE WHEN ( EXISTS (SELECT 
    1 AS [C1]
    FROM   (SELECT [Extent1].[Field1] AS [Field1]
        FROM  [dbo].[MyEntity] AS [Extent1]
        INNER JOIN [dbo].[SameEntity] AS [Extent2] ON [Extent1].[SameEntity_Id] = [Extent2].[Id]
        WHERE (N'123' = [Extent2].[SimpleField]) AND (123 = [Extent1].[AnotherEntity_Id]) AND -- further simple predicates here -- ) AS [Filter1]
    INNER JOIN [dbo].[AnotherEntity] AS [Extent3] ON [Filter1].[AnotherEntity_Id1] = [Extent3].[Id]
    WHERE N'123' = [Extent3].[SimpleField]
)) THEN cast(1 as bit) ELSE cast(0 as bit) END AS [C1]
FROM  ( SELECT 1 AS X ) AS [SingleRowTable1]

The query, in general, has optimal query plan, uses the right indices and returns in tens of milliseconds which is completely acceptable.

However, when a critical number of threads (<=40) starts executing this query, the performance on it drops to tens of seconds.

There are no locks in the database, no queries are writing data to these tables and it reproduces very well with a database that's practically isolated from any other operations. The DB resides on the same physical machine and the machine is not overloaded at any point, i.e. has plenty of spare CPU, memory and other resources the CPU is overloaded by this operation.

Now what's really bizarre is that when I replace the EF Any() call with Context.Database.ExecuteSqlCommand() with the copy-pasted SQL (also using parameters), the problem magically disappears. Again, this reproduces very reliably - replacing the Any() call with copy-pasted SQL increases the performance by 2-3 orders of magnitude .


An attached profiler (dotTrace) sampling shows that the threads seem to all spend their time in the following method:

dotTrace sample

Is there anything I've missed or did we hit some ADO.NET / SQL Server cornercase?


MORE CONTEXT

The code running this query is a Hangfire job. For the purpose of test, a script queues a lot of jobs to be performed and up to 40 threads keep processing the job. Each job uses a separate DbContext instance and it's not really being used a lot. There are a few more queries before and after the problematic query and they take expected times to execute.

We're using many different Hangfire jobs for similar purposes and they behave as expected. Same with this one, except when it gets slow under high concurrency (of exact same jobs). Also, just switching to SQL on this particular query fixes the problem.

The profiling snapshot above is representative, all the threads slow down on this particular method call and spend the vast majority of their time on it.


UPDATE

I'm currently re-running a lot of those checks for sanity and errors. The easy reproduction means it's still on a remote machine to which I can't connect using VS for debugging.

One of the checks showed that my previous statement about free CPU was false, the CPU was not entirely overloaded but multiple cores were in fact running on full capacity for the whole duration of the long running jobs.

Re-checking everything again and will come back with updates here.

like image 599
Jacek Gorgoń Avatar asked Sep 02 '16 15:09

Jacek Gorgoń


People also ask

Why is EF so slow?

Answer. Entity Framework loads very slowly the first time because the first query EF compiles the model. If you are using EF 6.2, you can use a Model Cache which loads a prebuilt edmx when using code first; instead, EF generates it on startup.

Why is my SQL query so slow?

Queries can become slow for various reasons ranging from improper index usage to bugs in the storage engine itself. However, in most cases, queries become slow because developers or MySQL database administrators neglect to monitor them and keep an eye on their performance.

Which SQL is faster?

SQL Server 2017 is the fastest database everywhere you need it. Whether it is your laptop, in your private cloud, or in our Azure public cloud infrastructure. Whether it is running on Linux, Windows, or Docker Containers, we have the speed to power any workload your application needs.


3 Answers

Can you try as shown below and see whether is there any performance improvement or not ...

Context.MyEntity.AsNoTracking()
  .Any(se => se.SameEntity.Field == someValue        
     && se.AnotherEntity.Field == anotherValue
     && se.SimpleField == simpleValue
    );
like image 110
Sampath Avatar answered Sep 19 '22 13:09

Sampath


Check if you are reusing the context in a loop. Doing so may create many objects during your performance test and giving the garbage collector a lot of work to do.

like image 27
acai Avatar answered Sep 18 '22 13:09

acai


Faulty initial assumptions. The SQL in the question was obtained by pasting the code into LINQPad and having it generate the SQL.

After attaching an SQL profiler to the actual DB used, it showed a slightly different SQL involving outer joins, which are suboptimal and didn't have a proper index in place.

It remains a mystery why LINQPad generated different SQL, even though it's using the same EntityFramework.dll, but the original problem is resolved and all that remains is to optimize the query.

Many thanks for everyone involved.

like image 44
Jacek Gorgoń Avatar answered Sep 20 '22 13:09

Jacek Gorgoń