Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Equivalence of query and method (lambda) syntax of a Join with Where clause

My simplified LINQ Join plus Where of two tables looks like this:

var join = context.Foo
  .Join(context.Bar,
    foo => new { foo.Year, foo.Month },
    bar => new { bar.Year, bar.Month },
    (foo, bar) => new { foo.Name, bar.Owner, foo.Year })
  .Where(anon => anon.Year == 2015).ToList();

Alternatively I could have used the following syntax that I hope to be equivalent:

var joinQuery = from foo in context.Foo
                join bar in context.Bar
                on new { foo.Year, foo.Month } equals new { bar.Year, bar.Month }
                where foo.Year == 2015
                select new { foo.Name, bar.Owner };
var join = joinQuery.ToList();

One difference that occurs to me and that I wonder about is the order of commands. In the lambda-syntax join I add the foo.Year property to my anonymous return type just so I can filter after, while in the other query I can still use foo (and bar if I wanted to) in the where clause. I don't need to add the field foo.Year to my return type here if I don't want or need to.

Unfortunately I don't have ReSharper or anything similar that could translate the lower statement to a lambda one so that I could compare.

What I could in fact do (and make the upper statement more similar in structure to the lower one) is add the following line between Where(..) and ToList() in the first one:

.Select(anon => new { /* the properties I want */ })

But doesn't this just add "one more" anonymous type creation compared to the 2nd statement, or am I mistaken here?

In short: What's the equivalent Join syntax to the 2nd statement? Or is the 1st one plus the added Select really equivalent, that is, does the joinQuery internally produce the same code?

like image 712
InvisiblePanda Avatar asked Mar 03 '15 08:03

InvisiblePanda


4 Answers

In the general case, you cannot always convert between query comprehension syntax and lambda syntax in exactly the same way the compiler does. This is due to the usage of transparent identifiers. But you can work around this and produce semantically equivalent lambda statements. And this is what ReSharper does.

Anyway, in your case, you can just add:

.Select(anon => new { /* the properties I want */ })

This will instantiate an anonymous type per row, but it won't be "one more", so don't worry about that: the expression is converted to SQL, so the new { foo.Year, foo.Month } statements in the join don't really instantiate these objects, they just get converted to SQL. Only the last select will both be used for the SQL SELECT list, and for object hydration once the rows are retrieved.

like image 58
Lucas Trzesniewski Avatar answered Nov 18 '22 13:11

Lucas Trzesniewski


But doesn't this just add "one more" anonymous type creation compared to the 2nd statement, or am I mistaken here?

As dotctor's answer shows: this is what the compiler is doing when you use the comprehension syntax in this case. By including the year in your single anonymous type you slightly reduce the overhead.

However:

  • Anonymous types are very lightweight: the use of generics, and that generics instantiated over reference types share implementation means there is little code (take a look at the assembly in a decompiler).
  • While lots of instances are created, for most cases they'll be cleaned up in generation 0 as they are released almost immediately.
  • The C# compiler's and JIT's optimisers have plenty to work with here. It is quite possible shortcuts are taken (but you'll need to read the x86/x64 assembly from a running process to see).

Remember the first two rules of optimisation: unless you can show – profiler data from realistic test data – you have a performance issue focus on clear code that is easy to maintain.

like image 33
Richard Avatar answered Nov 18 '22 13:11

Richard


Another alternative would be to move the where method before the join method and then leave the year out of the anonymous class:

var join = context.Foo
    .Where(foo => foo.Year == 2015)
    .Join(context.Bar,
        foo => new { foo.Year, foo.Month },
        bar => new { bar.Year, bar.Month },
        (foo, bar) => new { foo.Name, bar.Owner })
    .ToList();

But in general the other answers are right in that there is not much difference and the compiler can handle the details for you .

like image 3
Raidri Avatar answered Nov 18 '22 12:11

Raidri


for the In short: part of your question: the answer is yes.
here is resharpers result

var joinQuery = context.Foo.Join(context.Bar, foo => new
{
    foo.Year,
    foo.Month
}, bar => new
{
    bar.Year,
    bar.Month
}, (foo, bar) => new
{
    foo,
    bar
}).Where(@t => @t.foo.Year == 2015).Select(@t => new
{
    @t.foo.Name,
    @t.bar.Owner
});
like image 1
Hamid Pourjam Avatar answered Nov 18 '22 12:11

Hamid Pourjam