Linq To Objects - Under The Hood Of Joins

Tags: c#, linq

I would like to know the differences between the following two LINQ statements. Is one faster than the other, or are they equivalent?

What is the difference between this statement

from c in categories
from p in products
where c.cid == p.pid
select new { c.cname, p.pname };

and this statement?

from c in categories
join p in products on c.cid equals p.pid
select new { c.cname, p.pname };

Thanks in advance guys.

EDIT: I am asking in the context of LINQ to Objects.

asked Jun 25 '13 07:06 by ninja hedgehog




1 Answer

Okay, within LINQ to Objects the difference can be very dramatic.

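The first form examines every c and p pair, checks whether c.cid equals p.pid, and yields the matches. Its cost therefore grows with the product of the two collection sizes, and products is enumerated once per category.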
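As a rough illustration of that pairwise behaviour, the sketch below writes the first query in method syntax (two from clauses become a SelectMany, followed by Where and Select) and counts how often the predicate runs. The sample data, the tuple shapes and the comparisons counter are hypothetical and only there for demonstration; it assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data; tuple shapes are assumptions for illustration only.
var categories = new List<(int cid, string cname)>
{
    (1, "Books"), (2, "Games"), (3, "Music")
};
var products = new List<(int pid, string pname)>
{
    (1, "Novel"), (2, "Chess"), (2, "Cards"), (9, "Orphan")
};

// Counts how many (c, p) pairs the Where predicate has to inspect.
int comparisons = 0;

// Method-syntax equivalent of:
//   from c in categories from p in products
//   where c.cid == p.pid select new { c.cname, p.pname }
var crossJoin = categories
    .SelectMany(c => products, (c, p) => new { c, p })   // every c paired with every p
    .Where(x => { comparisons++; return x.c.cid == x.p.pid; })
    .Select(x => new { x.c.cname, x.p.pname })
    .ToList();

Console.WriteLine($"{crossJoin.Count} matches after {comparisons} comparisons");
```

With 3 categories and 4 products, the predicate runs for all 12 pairs even though only 3 of them match.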

The second form (implemented by Join) first builds a hash-based lookup from pid to the matching Product elements. It then streams the categories and, for each category, uses c.cid to find the matching Product elements in the lookup. This is generally much more efficient, because products is enumerated only once to build the lookup. On the other hand, it has a higher memory footprint, since the lookup holds every product. All of this is done somewhat lazily, of course: nothing significant happens until you ask for the first result.
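The lookup-then-stream strategy can be sketched as a small iterator. HashJoin below is a hypothetical, simplified stand-in and not the framework's actual source: it skips argument validation and custom comparers, and unlike Enumerable.Join it does not discard null keys. The sample data and the keyCalls counter are likewise assumptions, and the code assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical simplified join: buffer the inner sequence, stream the outer one.
static IEnumerable<TResult> HashJoin<TOuter, TInner, TKey, TResult>(
    IEnumerable<TOuter> outer,
    IEnumerable<TInner> inner,
    Func<TOuter, TKey> outerKey,
    Func<TInner, TKey> innerKey,
    Func<TOuter, TInner, TResult> result)
{
    // Runs on the first MoveNext(), not when HashJoin is called:
    // the whole inner sequence is read once into a hash-based lookup.
    ILookup<TKey, TInner> lookup = inner.ToLookup(innerKey);

    foreach (TOuter o in outer)
    {
        // A missing key yields an empty sequence, so unmatched items are skipped.
        foreach (TInner i in lookup[outerKey(o)])
        {
            yield return result(o, i);
        }
    }
}

// Hypothetical sample data.
var categories = new[] { (cid: 1, cname: "Books"), (cid: 2, cname: "Games"), (cid: 3, cname: "Music") };
var products = new[] { (pid: 1, pname: "Novel"), (pid: 2, pname: "Chess"), (pid: 2, pname: "Cards"), (pid: 9, pname: "Orphan") };

int keyCalls = 0;   // how many products were read to build the lookup
var query = HashJoin(categories, products,
    c => c.cid,
    p => { keyCalls++; return p.pid; },
    (c, p) => new { c.cname, p.pname });

int callsBeforeEnumeration = keyCalls;   // nothing has executed yet
var joined = query.ToList();

Console.WriteLine($"{joined.Count} matches, inner key selector ran {keyCalls} times");
```

Here each product is visited once to build the lookup (4 key-selector calls) rather than once per category, and callsBeforeEnumeration stays at 0 because no work is done until the query is enumerated.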

For more details on the Join operation, see my Edulinq blog post on the topic.

answered Sep 28 '22 11:09 by Jon Skeet