Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

When to prefer joins expressed with SelectMany() over joins expressed with the join keyword in Linq

Tags:

join

linq

Linq allows to express inner joins by using the join keyword or by using SelectMany() (i.e. a couple of from keywords) with a where keyword:

var personsToState = from person in persons
                     join state in statesOfUS
                     on person.State equals state.USPS
                     select new { person, State = state.Name };
foreach (var item in personsToState)
{
    System.Diagnostics.Debug.WriteLine(item);
}

// The same query can be expressed with the query operator SelectMany(), which is
// expressed as two from clauses and a single where clause connecting the sequences.                     
var personsToState2 = from person in persons
                      from state in statesOfUS
                      where person.State == state.USPS
                      select new { person, State = state.Name };
foreach (var item in personsToState2)
{
    System.Diagnostics.Debug.WriteLine(item);
}

My question: when is it purposeful to use the join-style and when to use the where-style, has one style performance advantages over the other style?

like image 295
Nico Avatar asked Oct 26 '10 16:10

Nico


2 Answers

For local queries Join is more efficient due to its keyed lookup as Athari mentioned, however for LINQ to SQL (L2S) you'll get more mileage out of SelectMany. In L2S a SelectMany ultimately uses some type of SQL join in the generated SQL depending on your query.

Take a look at questions 11 & 12 of the LINQ Quiz by Joseph/Ben Albahari, authors of C# 4.0 In a Nutshell. They show samples of different types of joins and they state:

With LINQ to SQL, SelectMany-based joins are the most flexible, and can perform both equi and non-equi joins. Throw in DefaultIfEmpty, and you can do left outer joins as well!

In addition, Matt Warren has a detailed blog post on this topic as it pertains to IQueryable / SQL here: LINQ: Building an IQueryable provider - Part VII.

Back to your question of which to use, you should use whichever query is more readable and allows you to easily express yourself and construct your end goal clearly. Performance shouldn't be an initial concern unless you are dealing with large collections and have profiled both approaches. In L2S you have to consider the flexibility SelectMany offers you depending on the way you need to pair up your data.

like image 98
Ahmad Mageed Avatar answered Jan 04 '23 12:01

Ahmad Mageed


Join is more efficient, it uses Lookup class (a variation of Dictionary with multiple values for a single key) to find matching values.

like image 45
Athari Avatar answered Jan 04 '23 13:01

Athari