Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the difference between where and join?

Tags:

c#

linq

What is the difference between

var q_nojoin = from o in one
               from t in two
               where o.SomeProperty == t.SomeProperty
               select new { o, t };

and

var q_join = from o in one
             join t in two on o.SomeProperty equals t.SomeProperty
             select new { o, t };

They seem to give me the same results.

like image 717
Difference Engine Avatar asked Sep 15 '10 07:09

Difference Engine


1 Answers

They give the same result, but the join is very much faster, unless you use LINQ to SQL so that the database can optimise the queries.

I made a test with two arrays containing 5000 items each, and the query with a join was about 450 times faster (!) than the query without a join.

If you use LINQ to SQL, the database will optimise both queries to do the same job, so there is no performance difference in that case. However, an explicit join is considered more readable.

If you are using LINQ against a different data source, there is no optimising layer, so there is a significant difference in how the queries work. The join uses a hash table or similar to quickly look up matching values, while the query without a join will compare all items in one table with each item in the other table. The complexity of the join is roughly O(n+m), while the complexity of the query without the join is O(n*m). This means not only that the query without the join is slower, but also that it scales badly, so as the data grows it will get exponentially slower.

like image 153
Guffa Avatar answered Nov 06 '22 06:11

Guffa