Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the difference between Joining two different DB Context using ToList() and .AsQueryable()?

Tags:

c#

sql

linq

Case 1: I am Joined two different DB Context by ToList() method in Both Context.

Case 2: And also tried Joining first Db Context with ToList() and second with AsQueryable().

Both worked for me. All I want to know is the difference between those Joinings regarding Performance and Functionality. Which one is better ?

var users = (from usr in dbContext.User.AsNoTracking()
                  select new
                  {
                     usr.UserId,
                     usr.UserName
                  }).ToList();

 var logInfo= (from log in dbContext1.LogInfo.AsNoTracking()
               select new
               {
                   log.UserId,
                   log.LogInformation
               }).AsQueryable();

 var finalQuery= (from usr in users
                  join log in logInfo on usr.UserId equals log.UserId
                  select new
                  {
                     usr.UserName,
                     log.LogInformation
                  }.ToList();
like image 610
Karthik Arthik Avatar asked Nov 23 '15 08:11

Karthik Arthik


2 Answers

I'll elaborate answer that was given by Jehof in his comment. It is true that this join will be executed in the memory. And there are 2 reasons why it happens.

Firstly, this join cannot be performed in a database because you are joining an object in a memory (users) with a deferred query (logInfo). Based on that it is not possible to generate a query that could be send to a database. It means that before performing the actual join a deferred query is executed and all logs are retrieved from a database. To sum up, in this scenario 2 queries are executed in a database and join happens in memory. It doesn't matter if you use ToList + AsQueryable or ToList + ToList in this case.

Secondly, in your scenario this join can be performed ONLY in a memory. Even if you use AsQueryable with the first context and with the second context it will not work. You will get System.NotSupportedException exception with the message:

The specified LINQ expression contains references to queries that are associated with different contexts.

I wonder why you're using 2 DB contexts. Is it really needed? As I explained because of that you lost a possibility to take full advantage of deferred queries (lazy evaluation features).

If you really have to use 2 DB contexts, I'll consider adding some filters (WHERE conditions) to queries responsible for reading users and logs from DB. Why? For small number of records there is no problem. However, for large amount of data it is not efficient to perform joins in memory. For this purpose databases were created.

like image 121
Michał Komorowski Avatar answered Oct 03 '22 13:10

Michał Komorowski


It hasn't been explained yet why the statements actually work and why EF doesn't throw an exception that you can only use sequences of primitive types in a LINQ statement.

If you swap both lists ...

var finalQuery= (from log in logInfo
                 join usr in users on log.UserId equals usr.UserId
                 ...

EF will throw

Unable to create a constant value of type 'User'. Only primitive types or enumeration types are supported in this context.

So why does your code work?

That will become clear if we convert your statement to method syntax (which the runtime does under the hood):

users.Join(logInfo, usr => usr.UserId, log => log.UserId
            (usr,log) => new
                        {
                            usr.UserName,
                            log.LogInformation
                        }

Since users is an IEnumerable, the extension method Enumerable.Join is resolved as the appropriate method. This method accepts an IEnumerable as the second list to be joined. Therefore, logInfo is implicitly cast to IEnumerable, so it runs as a separate SQL statement before it partakes in the join.

In the version from log in logInfo join usr ..., Queryable.Join is used. Now usr is converted into an IQueryable. This turns the whole statement into one expression that EF unsuccessfully tries to translate into one SQL statement.

Now a few words on

Which one is better?

The best option is the one that does just enough to make it work. That means that

  • You can remove AsQueryable(), because logInfo already is an IQueryable and it is cast to IEnumerable anyway.
  • You can replace ToList() by AsEnumerable(), because ToList() builds a redundant intermediate result, while AsEnumerable() only changes the runtime type of users, without triggering its execution yet.
like image 26
Gert Arnold Avatar answered Oct 03 '22 14:10

Gert Arnold