Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

EF Takes Forever to Generate this Query

I have a parent-child table relationship. In a repository, I'm doing this:

return (from p in _ctx.Parents  
.Include( "Children" )  
select p).AsQueryable<Parent>();  

Then in a filter, I want to filter the parent by a list of child ids:

IQueryable<Parent> qry;  // from above
List<int> ids;  // huge list (8500)
var filtered =
from p in qry.Where( p => p.Children.Any(c => ids.Contains(c.ChildId)) ) select s;  

My list of ids is huge. This generates a simple SQL statement that does have a huge list of ids "in (1,2,3...)", but it takes no appreciable time to run by itself. EF, however, takes about a full minute just to generate the statement. I proved this by setting a breakpoint and calling:

((ObjectQuery<Parent>)filtered).ToTraceString();

This takes all the time. Is the problem in my last linq statement? I don't know any other way to do the equivalent of Child.ChildId in (ids). And even if my linq statement is bad, how in the world should this take so long?

like image 399
dudeNumber4 Avatar asked Aug 17 '10 15:08

dudeNumber4


2 Answers

Unfortunately, building queries in Linq to Entities is a pretty heavy hit, but I've found it usually saves time due to the ability to build queries from their component pieces before actually hitting the database.

It is likely that the way they implement the Contains method uses an algorithm that assumes that Contains is generally used for a relatively small set of data. According to my tests, the amount of time it takes per ID in the list begins to skyrocket at around 8000.

So it might help to break your query into pieces. Group them into groups of 1000 or less, and concatenate a bunch of Where expressions.

var idGroups = ids.GroupBy(i => i / 1000);
var q = Parents.Include("Children").AsQueryable();
var newQ = idGroups.Aggregate(q, 
    (s, g) => s.Concat(
                  q.Where(w => w.Children.Any(wi => g.Contains(wi.ChildId)))));

This speeds things up significantly, but it might not be significantly enough for your purposes, in which case you'll have to resort to a stored procedure. Unfortunately, this particular use case just doesn't fit into "the box" of expected Entity Framework behavior. If your list of ids could begin as a query from the same Entity Context, Entity Framework would have worked fine.

like image 107
StriplingWarrior Avatar answered Oct 08 '22 09:10

StriplingWarrior


Re-write your query in Lambda syntax and it will cut the time by as much as 3 seconds (or at least it did for my EF project).

return _ctx.Parents.Include( "Children" ).AsQueryable<Parent>();  

and

IQueryable<Parent> qry;  // from above
List<int> ids;  // huge list (8500)
var filtered = qry.Where( p => p.Children.Any(c => ids.Contains(c.ChildId)) );
like image 30
Nate Zaugg Avatar answered Oct 08 '22 07:10

Nate Zaugg