Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

.Include() vs .Load() performance in EntityFramework

When querying a large table where you need to access the navigation properties later on in code (I explicitly don't want to use lazy loading) what will perform better .Include() or .Load()? Or why use the one over the other?

In this example the included tables all only have about 10 entries and employees has about 200 entries, and it can happen that most of those will be loaded anyway with include because they match the where clause.

Context.Measurements.Include(m => m.Product)                     .Include(m => m.ProductVersion)                     .Include(m => m.Line)                     .Include(m => m.MeasureEmployee)                     .Include(m => m.MeasurementType)                     .Where(m => m.MeasurementTime >= DateTime.Now.AddDays(-1))                     .ToList(); 

or

Context.Products.Load(); Context.ProductVersions.Load(); Context.Lines.Load(); Context.Employees.Load(); Context.MeasurementType.Load();  Context.Measurements.Where(m => m.MeasurementTime >= DateTime.Now.AddDays(-1))                     .ToList(); 
like image 822
Staeff Avatar asked Oct 11 '13 13:10

Staeff


People also ask

What is include in Entity Framework?

Entity Framework Classic Include The Include method lets you add related entities to the query result. In EF Classic, the Include method no longer returns an IQueryable but instead an IncludeDbQuery that allows you to chain multiple related objects to the query result by using the AlsoInclude and ThenInclude methods.

Why we use include in LINQ query?

Introduction to LINQ Include. LINQ include helps out to include the related entities which loaded from the database. It allows retrieving the similar entities to be read from database in a same query. LINQ Include() which point towards similar entities must read from the database to get in a single query.

Is Entity Framework slow?

Entity Framework loads very slowly the first time because the first query EF compiles the model. If you are using EF 6.2, you can use a Model Cache which loads a prebuilt edmx when using code first; instead, EF generates it on startup.

Is EF core faster than ef6?

EF Core 6.0 itself is 31% faster executing queries. Heap allocations have been reduced by 43%.


2 Answers

It depends, try both

When using Include(), you get the benefit of loading all of your data in a single call to the underlying data store. If this is a remote SQL Server, for example, that can be a major performance boost.

The downside is that Include() queries tend to get really complicated, especially if you have any filters (Where() calls, for example) or try to do any grouping. EF will generate very heavily nested queries using sub-SELECT and APPLY statements to get the data you want. It is also much less efficient -- you get back a single row of data with every possible child-object column in it, so data for your top level objects will be repeated a lot of times. (For example, a single parent object with 10 children will product 10 rows, each with the same data for the parent-object's columns.) I've had single EF queries get so complex they caused deadlocks when running at the same time as EF update logic.

The Load() method is much simpler. Each query is a single, easy, straightforward SELECT statement against a single table. These are much easier in every possible way, except you have to do many of them (possibly many times more). If you have nested collections of collections, you may even need to loop through your top level objects and Load their sub-objects. It can get out of hand.

Quick rule-of-thumb

Try to avoid having any more than three Include calls in a single query. I find that EF's queries get too ugly to recognize beyond that; it also matches my rule-of-thumb for SQL Server queries, that up to four JOIN statements in a single query works very well, but after that it's time to consider refactoring.

However, all of that is only a starting point.

It depends on your schema, your environment, your data, and many other factors.

In the end, you will just need to try it out each way.

Pick a reasonable "default" pattern to use, see if it's good enough, and if not, optimize to taste.

like image 63
Michael Edenfield Avatar answered Oct 10 '22 05:10

Michael Edenfield


Include() will be written to SQL as JOIN: one database roundtrip.

Each Load()-instruction is "explicitly loading" the requested entities, so one database roundtrip per call.

Thus Include() will most probably be the more sensible choice in this case, but it depends on the database layout, how often this code is called and how long your DbContext lives. Why don't you try both ways and profile the queries and compare the timings?

See Loading Related Entities.

like image 32
CodeCaster Avatar answered Oct 10 '22 05:10

CodeCaster