I would be grateful if someone could explain the meaning of the term into
while using LINQ. In general, I am trying to understand how to make INNER JOIN, LEFT OUTER JOIN etc. in C#.
I have the main table Students
that stores a few foreign ID keys which then are substituted by their names when running a query. The names are read from look up tables such as Marks
, SoftwareVersions
, Departments
etc. All fields are required but MarkID
. The query I tried to build in LINQ is this:
SELECT * FROM dbo.Students
INNER JOIN dbo.Departments ON dbo.Students.DepartmentID=dbo.Departments.DepartmentID
INNER JOIN dbo.SoftwareVersions ON dbo.Students.SoftwareVersionID=dbo.SoftwareVersions.SoftwareVersionID
INNER JOIN dbo.Statuses ON dbo.Students.StatusID=dbo.Statuses.StatusID
LEFT JOIN dbo.Marks ON dbo.Students.MarkID=dbo.Marks.MarkID
WHERE dbo.Students.DepartmentID=17;
I somehow managed to get the code below worked after reading plenty of articles and watching some videos but I don't feel like I have a complete understanding of the code. The bits that confuse me are in 5th line ending with into
and then in the very next line beginning with from m ...
. I'm confused what into
does and and what really happens in from m ...
. And this is the code in LINQ:
var result = from st in dbContext.Students where st.DepartmentID == 17
join d in dbContext.Departments on st.DepartmentID equals d.DepartmentID
join sv in dbContext.SoftwareVersions on st.SoftwareVersionID equals sv.SoftwareVersionID
join stat in dbContext.Statuses on st.StatusID equals stat.StatusID
join m in dbContext.Marks on st.MarkID equals m.MarkID into marksGroup
from m in marksGroup.DefaultIfEmpty()
select new
{
student = st.StudentName,
department = p.DepartmentName,
software = sv.SoftwareVersionName,
status = st.StatusName,
marked = m != null ? m.MarkName : "-- Not marked --"
};
A left outer join is a join in which each element of the first collection is returned, regardless of whether it has any correlated elements in the second collection. You can use LINQ to perform a left outer join by calling the DefaultIfEmpty method on the results of a group join.
A left outer join is a method of combining tables. The result includes unmatched rows from only the table that is specified before the LEFT OUTER JOIN clause. If you are joining two tables and want the result set to include unmatched rows from only one table, use a LEFT OUTER JOIN clause or a RIGHT OUTER JOIN clause.
One commonly used feature of Language-Integrated Query (LINQ) is the facility to combine two sequences of related data using joins. The standard join operation provides an inner join but with a minor modification can be changed to give a left outer join.
I believe Example section from How to: Perform Left Outer Joins MSDN page is really well explained. Let's project it to your example. To quote first paragraph from the page
The first step in producing a left outer join of two collections is to perform an inner join by using a group join. (See How to: Perform Inner Joins (C# Programming Guide) for an explanation of this process.) In this example, the list of Person objects is inner-joined to the list of Pet objects based on a Person object that matches Pet.Owner.
So in your case, the first step is to perform an inner join of list of Students
objects with the list of Marks
objects based on MarkID
in Students
object matches MarkID
in Marks
object. As can be seen in the quote, inner join is being performed using group join. If you check Note
section in MSDN page on how to perform group join, you can see that
Each element of the first collection appears in the result set of a group join regardless of whether correlated elements are found in the second collection. In the case where no correlated elements are found, the sequence of correlated elements for that element is empty. The result selector therefore has access to every element of the first collection.
What this means in the context of your example, is that by using into
you have group joined
results where you have all Students
objects, and sequence of correlated elements of Marks
objects (in case there is no matching Marks
objects, the sequence is going to be empty).
Now let's go back to How to: Perform Left Outer Joins MSDN page
, in particular second paragraph
The second step is to include each element of the first (left) collection in the result set even if that element has no matches in the right collection. This is accomplished by calling DefaultIfEmpty on each sequence of matching elements from the group join. In this example, DefaultIfEmpty is called on each sequence of matching Pet objects. The method returns a collection that contains a single, default value if the sequence of matching Pet objects is empty for any Person object, thereby ensuring that each Person object is represented in the result collection.
Again, to project this to your example, DefaultIsEmpty()
is being called on each sequence of matching Marks
objects. As explained above, the method returns a collection that contains a single, default value if the sequence of matching Marks
objects is empty for any Student
object, which ensures each Student
object will be represented in the resulting collection. As a result what you have is set of elements, that contain all Student
objects, and matching Marks
object, or if there is no matching Marks
object, default value of Marks
, which in this case is null
.
what I can say is that "into MarksGroup" stores the result data of your joined tables into a temporary (application based, not database based) resultset (in sql terms: a table, so its a SELECT INTO
)
In the next line, your code then selects from Marksgroup the columns with your data (in sql terms: SELECT student, department, software, status, marked FROM Marksgroup
So basically, it's getting your data from the db, then putting it aside to "Marksgroup, and in the very next step getting Marksgroup back in your fingers to take out the data you want to use in your c# code.
Try to get rid of Marksgroup, it should be possible (haven't tested ist with your code). It should be something like this:
from st in dbContext.Students where st.DepartmentID == 17
join d in dbContext.Departments on st.DepartmentID equals d.DepartmentID
join sv in dbContext.SoftwareVersions on st.SoftwareVersionID equals sv.SoftwareVersionID
join stat in dbContext.Statuses on st.StatusID equals stat.StatusID
join m in dbContext.Marks on st.MarkID equals m.MarkID
select new
{
student = st.StudentName,
department = p.DepartmentName,
software = sv.SoftwareVersionName,
status = st.StatusName,
marked = m != null ? m.MarkName : "-- Not marked --"
};
Your second question with 'm' : This should also show a different behaviour without your temporary resultset "Marksgroup"
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With