Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Prevent NULL checks in LINQ to Entity Joins

We have a table called Student. That table has a field called Homeroom, where the value is a room number of the student's homeroom. The value can be null.

We have a second table called Staff. That table also has a field called Homeroom to indicate which homeroom the teacher is assigned to. The value can be null.

But when the student's Homeroom is null, a Staff record should not be returned.

We used to take advantage of the fact that checking two null fields for equality always returns false in SQL. Through SQL, this is how we would get the data we want:

SELECT STUDENT.ID, STAFF.NAME as [Homeroom Teacher]
FROM STUDENT
LEFT OUTER JOIN STAFF ON
    STAFF.BUILDING = STUDENT.BUILDING AND
    STAFF.HOMEROOM = STUDENT.HOMEROOM

Student would be returned, but no teacher.

We are using Entity Framework with Code First POCO objects. So, we have a Student object and a Staff object. When we recreate this SQL in LINQ:

from student in repo.GetStudents()
join homeroomTeacher in repo.GetStaff()
    new { student.Building, Room = student.Homeroom }
     equals new { homeroomTeacher.Building, Room = homeroomTeacher.Homeroom }
into roj2
from homeroomTeacherRoj in roj2.DefaultIfEmpty()
select student.Id, homeroomTeacherRoj.Name;

The SQL generated contains a NULL check on both Homeroom fields:

SELECT STUDENT.ID, STAFF.NAME
FROM STUDENT AS [Extent1]
LEFT OUTER JOIN [dbo].[STAFF] AS [Extent2] ON 
    ([Extent1].[BUILDING] = [Extent2].[BUILDING]) AND 
    (
        ([Extent1].[HOMEROOM] = [Extent2].[HOMEROOM]) OR 
        (([Extent1].[HOMEROOM] IS NULL) AND ([Extent2].[HOMEROOM] IS NULL))
    )

This will return the student, and any staff who does not have a homeroom defined. That's not what we wanted or expected based on how we previously wrote our SQL statements.

An obvious way around it is to make sure we don't include staff that do not have a homeroom (join homeroomTeacher in repo.GetStaff().Where(staff => staff.Homeroom != null). But is there another way in the LINQ to prevent null checks on fields when joining them?

like image 504
scott.korin Avatar asked Dec 20 '22 23:12

scott.korin


1 Answers

if you move your join into a where clause, then the following setting on the DbContext object will turn off the (EF 6 introduced) NULL checking behaviour:

Context.Configuration.UseDatabaseNullSemantics = true;

So, to "Join" in a where clause, you can split out the query into 2 IQueryable objects

var subquery = from homeroomTeacher in repo.GetStaff()
               where ...
               select homeroomTeacher;

var query = from student in repo.GetStudents()
            where subquery.Any(homeroomTeacher => 
                 homeroomTeacher.xxx == student.xxx) -- simplified join for demo code
            select student;

fyi, the UseDatabaseNullSemantics was introduced to fix-up this behaviour but it looks like they forgot the JOIN semantics and only applied it to the WHERE semantics.

This original statement was wrong - EF 4.3.1 exhibited the same JOIN behaviour:

This essentially means some result sets are now different for EF 6, compared to previous versions. This, in my mind, IS A BIG DEAL !!!! as it introduced bugs into my working solutions !!!!

I have raised an issue on codeplex:https://entityframework.codeplex.com/workitem/2006

like image 125
Simon Dowdeswell Avatar answered Jan 07 '23 19:01

Simon Dowdeswell