
one-to-many projected LINQ query executes repeatedly

I am projecting LINQ to SQL results to strongly typed classes: Parent and Child. The performance difference between these two queries is large:

Slow Query - logging from the DataContext shows that a separate call to the db is being made for each parent

var q = from p in parenttable
        select new Parent()
        {
            id = p.id,
            Children = (from c in childtable
                        where c.parentid == p.id
                        select c).ToList()
        };
return q.ToList();  //SLOW

Fast Query - logging from the DataContext shows a single db hit query that returns all required data

var q = from p in parenttable
        select new Parent()
        {
            id = p.id,
            Children = from c in childtable
                       where c.parentid == p.id
                       select c
        };
return q.ToList();  //FAST

I want to force LINQ to use the single-query style of the second example, but populate the Parent classes with their Children objects directly. Otherwise, the Children property is an IQueryable<Child> that has to be queried to expose the Child objects.

The referenced questions do not appear to address my situation. Using db.LoadOptions does not work; perhaps it requires the type to be a TEntity registered with the DataContext.

   DataLoadOptions options = new DataLoadOptions();
   options.LoadWith<Parent>(p => p.Children);
   db.LoadOptions = options;

Please note: Parent and Child are simple types, not Table<TEntity> types, and there is no contextual relationship between Parent and Child. The subqueries are ad hoc.

The crux of the issue: in the second LINQ example I use IQueryable statements and do not call ToList(), and for some reason LINQ knows how to generate a single query that can retrieve all the required data. How do I populate my ad-hoc projection with the actual data, as the first query does? Also, if anyone could help me phrase my question better, I would appreciate it.

Paul asked Apr 26 '13 15:04



2 Answers

It's important to remember that LINQ queries rely on deferred execution. In your second query you aren't actually fetching any information about the children. You've created the queries, but you haven't executed them to get their results. If you were to iterate the list, and then iterate the Children collection of each item, you'd see it take as much time as the first query.
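Deferred execution can be seen in isolation with a minimal sketch. It uses LINQ to Objects over an in-memory array rather than LINQ to SQL, and the counter and class name are made up for the illustration. The counter tracks how many times the projection actually runs; with a database-backed query, the corresponding cost would be a round trip to the server rather than a lambda call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class DeferredExecutionDemo
{
    static void Main()
    {
        int evaluations = 0;
        var source = new[] { 1, 2, 3 };

        // Building the query runs nothing yet; Select only records the lambda.
        IEnumerable<int> query = source.Select(x => { evaluations++; return x * 10; });
        Console.WriteLine(evaluations); // 0

        // ToList() enumerates the query, so the lambda runs once per element.
        List<int> results = query.ToList();
        Console.WriteLine(evaluations); // 3

        // Enumerating the deferred query again re-runs it from scratch.
        int sum = query.Sum();
        Console.WriteLine(evaluations); // 6
    }
}
```

The materialized list can be reused without further evaluations, whereas every enumeration of the deferred query repeats the work.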

Your query is also inherently very inefficient. You're using a nested query in order to represent a Join relationship. If you use a Join instead the query will be able to be optimized appropriately by both the query provider as well as the database to execute much more quickly. You may also need to adjust the indexes on your database to improve performance. Here is how the join might look:

var q = from p in parenttable
        join child in childtable
        on p.id equals child.parentid into children
        select new Parent()
        {
            id = p.id,
            Children = children.ToList(),
        };
return q.ToList();
Servy answered Oct 19 '22 23:10



The fastest way I found to accomplish this is to run one query that returns all the results, then group those results. Make sure you call .ToList() on the first query, so that the second query doesn't make many database calls.

Here, r should contain what you want, using only a single db query.

var q = (from p in parenttable
         join c in childtable on p.id equals c.parentid
         select c).ToList();
var r = q.GroupBy(x => x.parentid).Select(x => new { id = x.Key, Children = x });
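To end up with the strongly typed Parent objects the question asks for, the grouping step can project into Parent rather than an anonymous type. The sketch below is a hedged illustration, not a tested LINQ to SQL solution: the Parent and Child classes are hypothetical stand-ins for the question's simple types, and an in-memory list stands in for the already materialized join result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the question's simple projection types.
class Child { public int id; public int parentid; }
class Parent { public int id; public List<Child> Children; }

class GroupInMemoryDemo
{
    static void Main()
    {
        // Assumption: this list plays the role of q, the flat join result
        // that was already fetched from the database with ToList().
        var q = new List<Child>
        {
            new Child { id = 10, parentid = 1 },
            new Child { id = 11, parentid = 1 },
            new Child { id = 12, parentid = 2 },
        };

        // Group in memory; each group becomes one Parent with a real List<Child>.
        List<Parent> parents = q
            .GroupBy(c => c.parentid)
            .Select(g => new Parent { id = g.Key, Children = g.ToList() })
            .ToList();

        foreach (var p in parents)
            Console.WriteLine($"{p.id}: {p.Children.Count}");
        // prints "1: 2" then "2: 1"
    }
}
```

Because the rows come from an inner join, a parent with no children produces no row and so does not appear in the grouped result. If such parents must be kept, one option is to fetch the parent ids separately and look up each parent's children from the grouped data.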
zeal answered Oct 19 '22 22:10
