I am projecting LINQ to SQL results to strongly typed classes: Parent and Child. The performance difference between these two queries is large:
Slow Query - logging from the DataContext shows that a separate call to the db is being made for each parent
var q = from p in parenttable
select new Parent()
{
id = p.id,
Children = (from c in childtable
where c.parentid = p.id
select c).ToList()
}
return q.ToList() //SLOW
Fast Query - logging from the DataContext shows a single db hit query that returns all required data
var q = from p in parenttable
select new Parent()
{
id = p.id,
Children = from c in childtable
where c.parentid = p.id
select c
}
return q.ToList() //FAST
I want to force LINQ to use the single-query style of the second example, but populate the Parent classes with their Children objects directly. otherwise, the Children property is an IQuerierable<Child>
that has to be queried to expose the Child object.
The referenced questions do not appear to address my situation. using db.LoadOptions does not work. perhaps it requires the type to be a TEntity registered with the DataContext.
DataLoadOptions options = new DataLoadOptions();
options.LoadWith<Parent>(p => p.Children);
db.LoadOptions = options;
Please Note: Parent and Child are simple types, not Table<TEntity>
types. and there is no contextual relationship between Parent and Child. the subqueries are ad-hoc.
The Crux of the Issue: in the 2nd LINQ example I implement IQueriable
statements and do not call ToList()
function and for some reason LINQ knows how to generate one single query that can retrieve all the required data. How do i populate my ad-hoc projection with the actual data as is accomplished in the first query? Also, if anyone could help me better-phrase my question, I would appreciate it.
It's important to remember that LINQ queries rely in deferred execution. In your second query you aren't actually fetching any information about the children. You've created the queries, but you haven't actually executed them to get the results of those queries. If you were to iterate the list, and then iterate the Children
collection of each item you'd see it taking as much time as the first query.
Your query is also inherently very inefficient. You're using a nested query in order to represent a Join
relationship. If you use a Join
instead the query will be able to be optimized appropriately by both the query provider as well as the database to execute much more quickly. You may also need to adjust the indexes on your database to improve performance. Here is how the join might look:
var q = from p in parenttable
join child in childtable
on p.id equals child.parentid into children
select new Parent()
{
id = p.id,
Children = children.ToList(),
}
return q.ToList() //SLOW
The fastest way I found to accomplish this is to do a query that returns all the results then group all the results. Make sure you do a .ToList() on the first query, so that the second query doesn't do many calls.
Here r
should have what you want to accomplish with only a single db query.
var q = from p in parenttable
join c in childtable on p.id equals c.parentid
select c).ToList();
var r = q.GroupBy(x => x.parentid).Select(x => new { id = x.Key, Children=x });
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With