Efficiency of Linq GroupJoin vs. Linq All in Select

Tags:

c#

linq

Does Linq use any sorting or other mechanisms to make a group join more efficient so it doesn't have to loop through an entire collection for every unmatched item?

In other words, is this:

var x = listA.GroupJoin(
        listB,
        a => a.prop,
        b => b.prop,
        (a, b) => new { a, b })
    .Where(g => !g.b.Any())   // keep only items of listA with no match in listB
    .Select(g => g.a);

more efficient than this:

var x = listA.Where(a => listB.All(b => b.prop != a.prop));
Daniel asked Oct 18 '22 02:10


1 Answer

I assume the question is about LINQ to Objects, i.e. Enumerable.GroupJoin. If so, then yes: the implementation of GroupJoin (as well as Join) uses a hash table, one of the most efficient general-purpose lookup data structures. This can be seen in the reference source and is also mentioned, although not directly, in the Remarks section of the documentation:

If comparer is null, the default equality comparer, Default, is used to hash and compare keys.

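Let N be the number of items in listA and M the number of items in listB. Since a hash lookup takes O(1) time on average, the join costs O(N + M): one pass over listB to build the lookup and one pass over listA to probe it. In the second case, listB is scanned for every item of listA, which is O(N * M) in the worst case, so the join is much more efficient for large inputs.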
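To make the difference concrete, here is a minimal sketch that runs both queries from the question on hypothetical sample data and then does the same work by hand with a HashSet. The data, the property name Prop, and the variable names are assumptions made for illustration, and the HashSet version only mirrors the idea of hashing the keys once; it is not the actual internals of Enumerable.GroupJoin. It assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data; "Prop" stands in for the question's "prop".
var listA = new[] { new { Prop = 1 }, new { Prop = 2 }, new { Prop = 3 }, new { Prop = 4 } };
var listB = new[] { new { Prop = 2 }, new { Prop = 4 }, new { Prop = 4 } };

// Nested scan: for every a, All() walks listB until it finds a match.
// Worst case is O(N * M) comparisons.
var viaAll = listA
    .Where(a => listB.All(b => b.Prop != a.Prop))
    .Select(a => a.Prop)
    .ToList();

// GroupJoin: listB is grouped by key once, then each a does a single lookup.
var viaGroupJoin = listA
    .GroupJoin(listB, a => a.Prop, b => b.Prop, (a, bs) => new { a, bs })
    .Where(g => !g.bs.Any())
    .Select(g => g.a.Prop)
    .ToList();

// Hand-rolled version of the same idea: hash the keys of listB once, then probe.
var keysInB = new HashSet<int>(listB.Select(b => b.Prop));
var viaHashSet = listA
    .Where(a => !keysInB.Contains(a.Prop))
    .Select(a => a.Prop)
    .ToList();

Console.WriteLine(string.Join(", ", viaAll));       // 1, 3
Console.WriteLine(string.Join(", ", viaGroupJoin)); // 1, 3
Console.WriteLine(string.Join(", ", viaHashSet));   // 1, 3
```

All three queries return the same items; only the amount of work differs. If only the unmatched items are needed, the HashSet form may be the simplest to read, since it avoids materializing the empty groups.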

Ivan Stoev answered Oct 21 '22 06:10