Update This may already be fixed: http://entityframework.codeplex.com/workitem/486
...
A fairly straightforward LINQ statement against my entities is resulting in unnecessarily complex SQL. More on that later, here's the setup:
Tables
Publication
Receipt
LINQ
var query = from r in context.Receipts.Include("Publication")
where r.DateInserted < lagDate
&& r.ReceiptId > request.AfterReceiptId
&& r.Publication.TopicId == topicEntity.TopicId
&& r.Publication.ReceiptCount > 1
select r;
SQL
exec sp_executesql N'SELECT TOP (25)
[Project1].[ReceiptId] AS [ReceiptId],
[Project1].[PublicationId] AS [PublicationId],
[Project1].[DateInserted] AS [DateInserted],
[Project1].[DateReceived] AS [DateReceived],
[Project1].[PublicationId1] AS [PublicationId1],
[Project1].[PayloadId] AS [PayloadId],
[Project1].[TopicId] AS [TopicId],
[Project1].[BrokerType] AS [BrokerType],
[Project1].[DateInserted1] AS [DateInserted1],
[Project1].[DateProcessed] AS [DateProcessed],
[Project1].[DateUpdated] AS [DateUpdated],
[Project1].[PublicationGuid] AS [PublicationGuid],
[Project1].[ReceiptCount] AS [ReceiptCount]
FROM ( SELECT
[Extent1].[ReceiptId] AS [ReceiptId],
[Extent1].[PublicationId] AS [PublicationId],
[Extent1].[DateInserted] AS [DateInserted],
[Extent1].[DateReceived] AS [DateReceived],
[Extent3].[PublicationId] AS [PublicationId1],
[Extent3].[PayloadId] AS [PayloadId],
[Extent3].[TopicId] AS [TopicId],
[Extent3].[BrokerType] AS [BrokerType],
[Extent3].[DateInserted] AS [DateInserted1],
[Extent3].[DateProcessed] AS [DateProcessed],
[Extent3].[DateUpdated] AS [DateUpdated],
[Extent3].[PublicationGuid] AS [PublicationGuid],
[Extent3].[ReceiptCount] AS [ReceiptCount]
FROM [dbo].[Receipt] AS [Extent1]
INNER JOIN [dbo].[Publication] AS [Extent2] ON [Extent1].[PublicationId] = [Extent2].[PublicationId]
LEFT OUTER JOIN [dbo].[Publication] AS [Extent3] ON [Extent1].[PublicationId] = [Extent3].[PublicationId]
WHERE ([Extent2].[ReceiptCount] > 1) AND ([Extent1].[DateInserted] < @p__linq__0) AND ([Extent1].[ReceiptId] > @p__linq__1) AND ([Extent2].[TopicId] = @p__linq__2)
) AS [Project1]
ORDER BY [Project1].[ReceiptId] ASC',N'@p__linq__0 datetime,@p__linq__1 int,@p__linq__2 int',@p__linq__0='2012-09-05 19:39:21:510',@p__linq__1=4458824,@p__linq__2=90
Problem
Publication gets joined twice:
.Include("Publication")
where
.If I remove [Extent2] from the SQL entirely, and change the WHERE bits to use [Extent3], I get the same results back. Since I'm not using Lazy Loading on my entities, I have to .Include("Publication")
... is there any solution for this?
I was using EF4, but grabbed EF5 from NuGet to see if it was perhaps fixed, but it produces the same result (although I have no idea how to tell if my EDMX is really using EF5).
As you may know, it is used to join and combine data from two or more tables into one common data set. In this article, I'm going to discuss special types of joins? in which you combine the same table twice—including joining a table to itself, also known as the self join.
You use a single table twice in a query by giving it two names, like that. The aliases are often introduced with the keyword AS. You also normally specify a join condition (for without it, you get the Cartesian Product of the table joined with itself). For preference you use the explicit JOIN notation.
You use self-join to create a result set that joins the rows with the other rows within the same table. Because you cannot refer to the same table more than one in a query, you need to use a table alias to assign the table a different name when you use self-join.
The LINQ join operator allows us to join multiple tables on one or more columns (multiple columns). By default, they perform the inner join of the tables. We also learn how to perform left joins in Entity Framework by using the join operator & DefaultIfEmpty method.
There is however, a work-around. It may not be the most elegant solution, but it does exactly what you want; it generates only one join.
Change:
var query = from r in context.Receipts.Include("Publication")
where r.DateInserted < lagDate
&& r.ReceiptId > request.AfterReceiptId
&& r.Publication.TopicId == topicEntity.TopicId
&& r.Publication.ReceiptCount > 1
select r;
To be:
var query = from r in context.Receipts
join pub in context.Publication on r.PublicationId equals pub.PublicationId
where r.DateInserted < lagDate
&& r.ReceiptId > request.AfterReceiptId
&& pub.TopicId == topicEntity.TopicId
&& pub.ReceiptCount > 1
select new {
Receipt = r,
Publication = pub
};
Note that we have removed the Include AND we are no longer using r.Publication.?? in the where clause. Instead we are using pub.??
Now when you loop through query, you will see that r.Publication is not null:
foreach ( var item in query)
{
//see that item.Publication is not null
if(item.Receipt != null && item.Receipt.Publication != null)
{
//do work based on a valid Publication
}
else
{
//do work based on no linked Publication
}
}
This behavior can be avoided by using temporary variables (eg, let
pub = r.Publication).
var query = from r in context.Receipts
let pub = r.Publication // using a temp variable
where r.DateInserted < lagDate
&& r.ReceiptId > request.AfterReceiptId
&& pub.TopicId == topicEntity.TopicId
&& pub.ReceiptCount > 1
select new { r, pub };
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With