Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

NHibernate: Why does Linq First() force only one item in all child and grandchild collections with FetchMany()

Domain Model

I've got a canonical Domain of a Customer with many Orders, with each Order having many OrderItems:

Customer

public class Customer
{
  public Customer()
  {
    Orders = new HashSet<Order>();
  }
  public virtual int Id {get;set;}
  public virtual ICollection<Order> Orders {get;set;}
}

Order

public class Order
{
  public Order()
  {
    Items = new HashSet<OrderItem>();
  }
  public virtual int Id {get;set;}
  public virtual Customer Customer {get;set;}
}

OrderItems

public class OrderItem
{
  public virtual int Id {get;set;}
  public virtual Order Order {get;set;}
}

Problem

Whether mapped with FluentNHibernate or hbm files, I run two separate queries, that are identical in their Fetch() syntax, with the exception of one including the .First() extension method.

Returns expected results:

var customer = this.generator.Session.Query<Customer>()
    .Where(c => c.CustomerID == id)
    .FetchMany(c => c.Orders)
    .ThenFetchMany(o => o.Items).ToList()[0];

Returns only a single item in each collection:

var customer = this.generator.Session.Query<Customer>()
    .Where(c => c.CustomerID == id)
    .FetchMany(c => c.Orders)
    .ThenFetchMany(o => o.Items).First();

I think I understand what's going on here, which is that the .First() method is being applied to each of the preceding statements, rather than just to the initial .Where() clause. This seems incorrect behavior to me, given the fact that First() is returning a Customer.

Edit 2011-06-17

After further research and thinking, I believe that depending on my mapping, there are two outcomes to this Method Chain:

    .Where(c => c.CustomerID == id)
    .FetchMany(c => c.Orders)
    .ThenFetchMany(o => o.Items);

NOTE: I don't think I can get subselect behavior, since I'm not using HQL.

  1. When the mapping is fetch="join" I should get a cartesian product between the Customer, Order and OrderItem tables.
  2. When the mapping is fetch="select" I should get a query for Customer, and then multiple queries each for Orders and OrderItems.

How this plays out with adding the First() method to the chain is where I lose track of what should be happening.

The SQL Query that get's issued is the traditional left-outer-join query, with select top (@p0) in front.

like image 782
rbellamy Avatar asked Jun 17 '11 01:06

rbellamy


2 Answers

The First() method is translated into SQL (T-SQL at least) as SELECT TOP 1 .... Combined with your join fetching, this will return a single row, containing one customer, one order for that customer and one item for the order. You might consider this a bug in Linq2NHibernate, but as join fetching is rare (and I think you're actually hurting your performance pulling the same Customer and Order field values across the network as part of the row for each Item) I doubt the team will fix it.

What you want is a single Customer, then all Orders for that customer and all Items for all those Orders. That happens by letting NHibernate run SQL that will pull one full Customer record (which will be a row for each Order Line) and construct the Customer object graph. Turning the Enumerable into a List and then getting the first element works, but the following will be slightly faster:

var customer = this.generator.Session.Query<Customer>()
    .Where(c => c.CustomerID == id)
    .FetchMany(c => c.Orders)
    .ThenFetchMany(o => o.Items)
    .AsEnumerable().First();

the AsEnumerable() function forces evaluation of the IQueryable created by Query and modified with the other methods, spitting out an in-memory Enumerable, without slurping it into a concrete List (NHibernate can, if it wishes, simply pull enough info out of the DataReader to create one full top-level instance). Now, the First() method is no longer applied to the IQueryable to be translated to SQL, but it is instead applied to an in-memory Enumerable of the object graphs, which after NHibernate has done its thing, and given your Where clause, should be zero or one Customer record with a hydrated Orders collection.

Like I said, I think you're hurting yourself using join fetching. Each row contains the data for the Customer and the data for the Order, joined to each distinct Line. That is a LOT of redundant data, which I think will cost you more than even an N+1 query strategy.

The best way I can think of to handle this is one query per object to retrieve that object's children. It would look like this:

var session = this.generator.Session;
var customer = session.Query<Customer>()
        .Where(c => c.CustomerID == id).First();

customer.Orders = session.Query<Order>().Where(o=>o.CustomerID = id).ToList();

foreach(var order in customer.Orders)
   order.Items = session.Query<Item>().Where(i=>i.OrderID = order.OrderID).ToList();

This requires a query for each Order, plus two at the Customer level, and will return no duplicate data. This will perform far better than a single query returning a row containing every field of the Customer and Order along with each Item, and also better than sending a query per Item plus a query per Order plus a query for the Customer.

like image 136
KeithS Avatar answered Oct 13 '22 15:10

KeithS


I'd like to update the answer with my found so that could help anybody else with the same problem.

Since you are querying the entity base on their ID, you can use .Single instead of .First or .AsEnumerable().First():

var customer = this.generator.Session.Query<Customer>()
    .Where(c => c.CustomerID == id)
    .FetchMany(c => c.Orders)
    .ThenFetchMany(o => o.Items).Single();

This will generate a normal SQL query with where clause and without the TOP 1.

In other situation, if the result has more than one Customer, exception will be thrown so it won't help if you really need the first item of a series based on condition. You have to use 2 queries, one for the first Customer and let the lazy load do the second one.

like image 31
ltvan Avatar answered Oct 13 '22 17:10

ltvan