We're facing a major performance problem after upgrading EF Core 2.2 to EF Core 3.0. Imagine a simple data model with a single collection navigation property and hundreds of fields (the reality looks even darker):
public class Item
{
[Key]
public int ItemID {get;set;}
public ICollection<AddInfo> AddInfos {get;set;}
... // consisting of another 100+ properties!
}
and
public class AddInfo
{
[Key]
public int AddInfoID {get;set;}
public int? ItemID {get;set;}
public string SomePayload {get;set;}
}
During item retrieval, we're querying as followed:
...
var myQueryable = this._context.Items.Include(i => i.AddInfos).Where(**some filter**);
... // moar filters
var result = myQueryable.ToList();
Straight forward, up until this point.
In EF 2.2, fetching that queryable results in two separate queries, one for Item
and one for the AddInfo
- level. These queries usually fetch 10.000 items
and around 250.000 AddInfos
.
In EF Core 3.0 however, a single query is being generated, left-joining AddInfo
to Item
that on first glance appears to be the better option. Our Item
however needs to be fetched with all 100+ fields, which is why projecting to a smaller class or anonymous type (adding a call to the .Select(...)-method) isn't feasible. Therefore, the result set has so much redundancy in it (each Item
approx. 25 times) that the query itself takes too long to run in an acceptable time.
Does EF-Core 3.0 provide any option that would enable us to switch back to the query-behavior of the good old EF Core 2.2 times without extensive changes to our data model? We're already profiting from this change in other parts of the application, but not in this particular scenario.
Many thanks in advance!
Update
After further investigation I found that this issue is already adressed with Microsoft here and out of the box, there seems to be no way to configure the split query execution.
Cartesian Explosion problem is when you load more and more one-to-many relationships, the amount of duplicate data grows and large number of rows are returned. This will have a large impact on the performance of your application. As you can see from the example, this could get really large very quickly.
Entity Framework (EF) Core, Microsoft's object-to-database mapper library for . NET Framework, brings performance improvements for data updates in version 7, Microsoft claims. The performance of SaveChanges method in EF7 is up to 74% faster than in EF6, in some scenarios. While EF is widely used in .
Lazy loading means that the related data is transparently loaded from the database when the navigation property is accessed.
As per the update of my initial question, the insights went so far as to assure myself that currenty, there is in fact no built in configuration to return to split query execution.
However, MS provided code samples on how to do this with minimal code changes (for our use case!) here.
We're simply removing the .Include(...) calls to collection navigation properties (1:n relations in our case, 1:1 relations are not affected!). After fetching the items, we're simply making another call using:
...
var myQueryable = this._context.Items.Where(**some filter**);
... // moar filters
var result = myQueryable.ToList();
...
var addInfos = myQueryable.Include(x => x.AddInfos).SelectMany(x => x.AddInfos).Select(x => new {x.ItemID, x}).ToList();
This fetches the collection navigation property entities and - if change tracking is enabled - automatically fills the collections on the individual items in the result
variable.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With