Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the difference between using Join in Linq and "Olde Style" pre ANSI join syntax?

This question follows on from a question I asked yesterday about why using the join query on my Entities produced horrendously complicated SQL. It seemed that performing a query like this:

        var query = from ev in genesisContext.Events
                    join pe in genesisContext.People_Event_Link
                    on ev equals pe.Event
                    where pe.P_ID == key
                    select ev;

Produced the horrible SQL that took 18 seconds to run on the database, whereas joining the entities through a where clause (sort of like pre-ANSI SQL syntax) took less than a second to run and produced the same result

        var query = from pe in genesisContext.People_Event_Link
                    from ev in genesisContext.Events
                    where pe.P_ID == key && pe.Event == ev
                    select ev;

I've googled all over but still don't understand why the second is produces different SQL to the first. Can someone please explain the difference to me? When should I use the join keyword


This is the SQL that was produced when I used Join in my query and took 18 seconds to run:

SELECT 
1 AS [C1], 
[Extent1].[E_ID] AS [E_ID], 
[Extent1].[E_START_DATE] AS [E_START_DATE], 
[Extent1].[E_END_DATE] AS [E_END_DATE], 
[Extent1].[E_COMMENTS] AS [E_COMMENTS], 
[Extent1].[E_DATE_ADDED] AS [E_DATE_ADDED], 
[Extent1].[E_RECORDED_BY] AS [E_RECORDED_BY], 
[Extent1].[E_DATE_UPDATED] AS [E_DATE_UPDATED], 
[Extent1].[E_UPDATED_BY] AS [E_UPDATED_BY], 
[Extent1].[ET_ID] AS [ET_ID], 
[Extent1].[L_ID] AS [L_ID]
FROM  [dbo].[Events] AS [Extent1]
INNER JOIN [dbo].[People_Event_Link] AS [Extent2] ON  EXISTS (SELECT 
    1 AS [C1]
    FROM    ( SELECT 1 AS X ) AS [SingleRowTable1]
    LEFT OUTER JOIN  (SELECT 
            [Extent3].[E_ID] AS [E_ID]
        FROM [dbo].[Events] AS [Extent3]
        WHERE [Extent2].[E_ID] = [Extent3].[E_ID] ) AS [Project1] ON 1 = 1
    LEFT OUTER JOIN  (SELECT 
        [Extent4].[E_ID] AS [E_ID]
        FROM [dbo].[Events] AS [Extent4]
        WHERE [Extent2].[E_ID] = [Extent4].[E_ID] ) AS [Project2] ON 1 = 1
    WHERE ([Extent1].[E_ID] = [Project1].[E_ID]) OR (([Extent1].[E_ID] IS NULL) AND ([Project2].[E_ID] IS NULL))
)
WHERE [Extent2].[P_ID] = 291

This is the SQL that was produce using the ANSI Style syntax (and is fairly close to what I would write if I were writing the SQL myself):

SELECT * FROM Events AS E INNER JOIN People_Event_Link AS PE ON E.E_ID=PE.E_ID INNER JOIN PEOPLE AS P ON P.P_ID=PE.P_ID
WHERE P.P_ID = 291
like image 332
Calanus Avatar asked Aug 06 '09 12:08

Calanus


1 Answers

Neither of the above queries are entirely "correct." In EF, it is generally correct to use the relationship properties in lieu of either of the above. For example, if you had a Person object with a one to many relationship to PhoneNumbers in a property called Person.PhoneNumbers, you could write:

var q = from p in Context.Person
        from pn in p.PhoneNumbers
        select pn;

The EF will build the join for you.

In terms of the question above, the reason the generated SQL is different is because the expression trees are different, even though they produce equivalent results. Expression trees are mapped to SQL, and you of course know that you can write different SQL which produces the same results but with different performance. The mapping is designed to produce decent SQL when you write a farily "conventional" EF query.

But the mapping is not so smart as to take a very unconventional query and optimize it. In your first query, you state that the objects must be equivalent. In the second, you state that the ID property must be equivalent. My sample query above says "just get the details for this one record." The EF is designed to work with the way I show, principally, but also handles scalar equivalence well.

like image 189
Craig Stuntz Avatar answered Nov 15 '22 05:11

Craig Stuntz