I am trying to GroupJoin some data with an IQueryable and project that data into an anonymous type. The original entity that I am GroupJoining onto has an ICollection navigation property (i.e. one:many). I want to eager load that property so I can access it after the group join without EF going back to the DB. I know that Include() doesn't work when you use a GroupJoin, but the following code is the only way I have found to make it eager load the collection (ContactRoomRoles):
using (var context = new MyDbContext()) {
    var foundRooms = context.Rooms.Include(rm => rm.ContactRoomRoles);
    foundRooms.ToList(); // <-- Required to make EF actually load ContactRoomRoles data!
    var roomsData = foundRooms
        .GroupJoin(
            context.Contacts,
            rm => rm.CreatedBy,
            cont => cont.Id,
            (rm, createdBy) => new {
                ContactRoomRoles = rm.ContactRoomRoles,
                Room = rm,
                CreatedBy = createdBy.FirstOrDefault()
            }
        )
        .ToList();
    var numberOfRoles1 = roomsData.ElementAt(1).Room.ContactRoomRoles.Count();
    var numberOfRoles2 = roomsData.ElementAt(2).Room.ContactRoomRoles.Count();
    var numberOfRoles3 = roomsData.ElementAt(3).Room.ContactRoomRoles.Count();
}
If I remove the foundRooms.ToList(), EF goes off to the database 3 times to populate my numberOfRoles variables at the end, but with foundRooms.ToList() it doesn't; it just eager loads the data in one query upfront.
Although this works, it feels like a total hack. I'm just calling .ToList() for the side-effect of making EF actually load the collection data. If I comment that line out, it goes to the database any time I try to access ContactRoomRoles. Is there a less hacky way to make EF eager load that navigation property?
NOTE: I want to use the navigation property rather than projecting it into a new property of the anonymous type because AutoMapper wants to access Room.ContactRoomRoles when it's mapping onto a DTO object.
This is not a hack. This is an abstraction leak. We should be ready to meet abstraction leaks when using ORM tools (and any other internal DSL).
After ToList() you not only execute the actual SQL call (and load the data into memory) but also cross over to another LINQ flavor, LINQ to Objects. After this, your Count() calls don't generate SQL, simply because you are now working with in-memory collections: not with the expression trees hidden behind IQueryable (the return type of the GroupJoin statement), but with a List (the return type of ToList).
Without ToList() you stay with the SQL-translating flavor of LINQ (LINQ to Entities), and EF will translate each call of Count() on the IQueryable to SQL: three Count() calls = three underlying SQL statements.
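The difference between a deferred query and a materialized list can be illustrated without EF at all. The following is only a minimal sketch in plain LINQ to Objects: the DeferredExecutionDemo class, its Source() method, and the Enumerations counter are hypothetical names invented for this illustration, and the counter merely stands in for "one round trip to the database".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo type; not part of EF or of the question's code.
public static class DeferredExecutionDemo
{
    // Incremented every time the underlying source is enumerated.
    public static int Enumerations;

    // Iterator method: its body only runs when someone enumerates the result.
    private static IEnumerable<int> Source()
    {
        Enumerations++; // stands in for a database round trip
        yield return 1;
        yield return 2;
        yield return 3;
    }

    public static void Main()
    {
        // Building the query executes nothing yet.
        IEnumerable<int> query = Source().Where(n => n > 1);

        // Each Count() on the deferred query re-enumerates the source.
        query.Count();
        query.Count();
        query.Count();
        Console.WriteLine(Enumerations); // 3

        Enumerations = 0;

        // ToList() enumerates once; later Count() calls read the in-memory list.
        List<int> materialized = query.ToList();
        materialized.Count();
        materialized.Count();
        materialized.Count();
        Console.WriteLine(Enumerations); // 1
    }
}
```

This only shows the general deferred-execution mechanism. Whether EF issues a query for a given property access also depends on lazy loading and on what the context has already tracked.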
There is no way to avoid this other than calculating all the count(*) values on the server side in one complex query. If you try to write such a query with LINQ (constructing an expression tree), you will meet the abstraction leak again. An ORM tool is designed to map objects to "RDBMS entities" and stay with CRUD (Create, Read, Update, Delete) operations; as a statement becomes more complex, you will not be able to foresee the generated SQL (or runtime exceptions like 'can't generate sql for such linq'). So do not use LINQ for complex 'report like' queries (in some cases you could; it depends on your re-use requirements and testing possibilities). Use good old SQL and call it through ADO.NET or EF's raw SQL extensions, like EF Core's FromSql (FromSqlRaw in EF Core 3.0 and later):
var blogs = context.Blogs
    .FromSql("EXECUTE dbo.GetMostPopularBlogsForUser {0}", user)
    .ToList();
Update: it is also a good recommendation to avoid lazy loading and manual entity loading unless you are working on reusable EF tools. They are in some sense the opposite of LINQ queries, i.e. expression trees. They were an important (if not the only) option for loading referenced entities on "old" platforms that had no expression trees in the language. But in .NET/EF, where full queries can be written declaratively as expression trees, without execution (but with postponed interpretation), there should be a very strong reason to go back to "manual" loading.
It's all about collections that are marked as loaded, or not.
The line
foundRooms.ToList();
(or foundRooms.Load()) loads all Rooms and their ContactRoomRoles collections into the context. Since the Include statement is used, these collections are marked as loaded by EF. You can check that by looking at
context.Entry(context.Rooms.Local.First()).Collection(r => r.ContactRoomRoles).IsLoaded
which should return true.
If you omit the line foundRooms.ToList();, each time a Room.ContactRoomRoles collection is accessed, EF will notice it's not marked as loaded yet and will lazy-load it. After that, the collection is marked as loaded, but it took an extra query.
A collection is only marked as loaded when it is
- Include-ed
- loaded by the Load() statement, as in
context.Entry(context.Rooms.Local.First()).Collection(r => r.ContactRoomRoles).Load();
Not when it is part of a projection into another property (like the part ContactRoomRoles = rm.ContactRoomRoles in your query).
However, after the statement var roomsData = foundRooms (...).ToList() all Room.ContactRoomRoles are populated, because the query did load them into the context, and EF always executes the relationship fixup process, which auto-populates navigation properties.
So, to summarize, after your query you have roomsData containing room objects with ContactRoomRoles collections that are populated but not marked as loaded.
Knowing this, it's apparent now that the only thing to do is to prevent lazy loading from occurring.
The best way to achieve that is to prevent EF from creating entity objects that are capable of lazy loading, aka proxies. You do that by adding the line
context.Configuration.ProxyCreationEnabled = false;
just below the using statement.
Now you'll notice that the line
var numberOfRoles1 = roomsData.ElementAt(1).Room.ContactRoomRoles.Count();
doesn't trigger an extra query, but does return the correct count.
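Putting the pieces of this answer together, the question's code might end up looking roughly like the sketch below. It assumes EF6 (the System.Data.Entity namespace from the EntityFramework package) and a configured database. The Room, Contact and ContactRoomRole classes and the RoomId key are guesses at the asker's model, written out only so the snippet is complete.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Entity; // EF6 assumed
using System.Linq;

// Hypothetical entity shapes; the real model is not shown in the question.
public class Contact { public int Id { get; set; } }

public class ContactRoomRole
{
    public int Id { get; set; }
    public int RoomId { get; set; } // assumed foreign key to Room
}

public class Room
{
    public int Id { get; set; }
    public int CreatedBy { get; set; }
    // Initialized so rooms without roles report a count of 0 instead of null.
    public virtual ICollection<ContactRoomRole> ContactRoomRoles { get; set; }
        = new List<ContactRoomRole>();
}

public class MyDbContext : DbContext
{
    public DbSet<Room> Rooms { get; set; }
    public DbSet<Contact> Contacts { get; set; }
    public DbSet<ContactRoomRole> ContactRoomRoles { get; set; }
}

public static class Program
{
    public static void Main()
    {
        using (var context = new MyDbContext())
        {
            // No proxies means no lazy loading when a collection is read later.
            context.Configuration.ProxyCreationEnabled = false;

            var roomsData = context.Rooms
                .GroupJoin(
                    context.Contacts,
                    rm => rm.CreatedBy,
                    cont => cont.Id,
                    (rm, createdBy) => new
                    {
                        // Projecting the collection loads it into the context;
                        // relationship fixup then populates Room.ContactRoomRoles.
                        ContactRoomRoles = rm.ContactRoomRoles,
                        Room = rm,
                        CreatedBy = createdBy.FirstOrDefault()
                    })
                .ToList();

            // Reads the already-populated collections without extra queries.
            foreach (var item in roomsData)
            {
                Console.WriteLine(item.Room.ContactRoomRoles.Count);
            }
        }
    }
}
```

The separate foundRooms.ToList() call from the question is left out here, since the projection itself brings the ContactRoomRoles rows into the context.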
This is called an abstraction leak, and it means your abstraction exposes some implementation details.
It happens when you call .ToList() and switch (I don't like the word cross) between LINQ to Entities and LINQ to Objects.
I'd recommend reading The Law of Leaky Abstractions to get a better grasp, as it is quite complicated to explain while standing on one foot.
The main idea is that when you attempt to provide a complete abstraction over an underlying unreliable layer, everything works as planned, though slower than usual; but sometimes the layer leaks through the abstraction, and you feel the things the abstraction can't quite protect you from.
Edit to clarify:
Calling ToList() forces LINQ to Entities to evaluate the query and return the results as a list.
Meaning that, for example, this query from the answer above:
var blogs = context.Blogs
    .FromSql("EXECUTE dbo.GetMostPopularBlogsForUser {0}", user)
    .ToList();
will be evaluated into the corresponding model of the context, the Blogs model.
In other words, the query is executed lazily, at the moment you call ToList().
Prior to the ToList() call, C# makes NO SQL calls, so at that point it is NOT an in-memory operation.
So yes, ToList() puts the data into memory as part of the context, and it is then read within the same context.