I'm a big fan of Linq, and I have been really enjoying the power of expression trees etc. But I have found that whenever I try to get too clever with my queries, I hit some kind of limitation in the framework: while the query can take a very short time to run on the database (as shown by performance analyzer), the results take ages to materialize. When that happens I know I've been too fancy, and I start breaking the query up into smaller, bite sized chunks - so I have a solution for that, though it might not always be the most optimal.
But I'd like to understand:
EDIT: As requested in comments, here's an example of a query that I measured to run on SQL Server in a few seconds, but took almost 2 minutes to materialize. I'm not going to try explaining all the stuff in context; it's here just so you can view the constructs and see an example of what I'm talking about:
Expression<Func<Staff, TeacherInfo>> teacherInfo =
st => new TeacherInfo
{
ID = st.ID,
Name = st.FirstName + " " + st.LastName,
Email = st.Email,
Phone = st.TelMobile,
};
var step1 =
currentReportCards.AsExpandable()
.GroupJoin(db.ScholarReportCards,
current =>
new { current.ScholarID, current.AcademicTerm.AcademicYearID },
past => new { past.ScholarID, past.AcademicTerm.AcademicYearID },
(current, past) => new
{
Current = current,
PastCards =
past.Where(
rc =>
rc.AcademicTerm.StartDate <
current.AcademicTerm.StartDate &&
rc.AcademicTerm.Grade == current.AcademicTerm.Grade &&
rc.AcademicTerm.SchoolID == current.AcademicTerm.SchoolID)
});
// This materialization is what takes a long time:
var subjects = step1.SelectMany(x => from key in x.Current.Subjects
.Select(s => new { s.Subject.SubjectID, s.Subject.SubjectCategoryID })
.Union(x.PastCards.SelectMany(c => c.Subjects)
.Select(
s => new { s.Subject.SubjectID, s.Subject.SubjectCategoryID }))
join cur in x.Current.Subjects on key equals
new { cur.Subject.SubjectID, cur.Subject.SubjectCategoryID } into jcur
from cur in jcur.DefaultIfEmpty()
join past in x.PastCards.SelectMany(p => p.Subjects) on key equals
new { past.Subject.SubjectID, past.Subject.SubjectCategoryID } into past
select new
{
x.Current.ScholarID,
IncludeInContactSection =
// ReSharper disable ConstantNullCoalescingCondition
(bool?)cur.Subject.IncludeInContactSection ?? false,
IncludeGrades = (bool?)cur.Subject.IncludeGrades ?? true,
// ReSharper restore ConstantNullCoalescingCondition
SubjectName =
cur.Subject.Subject.Name ?? past.FirstOrDefault().Subject.Subject.Name,
SubjectCategoryName = cur.Subject.SubjectCategory.Description,
ClassInfo = (from ce in myDb.ClassEnrollments
.Where(
ce =>
ce.Class.SubjectID == cur.Subject.SubjectID
&& ce.ScholarID == x.Current.ScholarID)
.Where(enrollmentExpr)
.OrderByDescending(ce => ce.TerminationDate ?? DateTime.Today)
let teacher = ce.Class.Teacher
let secTeachers = ce.Class.SecondaryTeachers
select new
{
ce.Class.Nickname,
Primary = teacherInfo.Invoke(teacher),
Secondaries = secTeachers.AsQueryable().AsExpandable()
.Select(ti => teacherInfo.Invoke(ti))
})
.FirstOrDefault(),
Comments = cur.Comments
.Select(cc => new
{
Staff = cc.Staff.FirstName + " "
+ cc.Staff.LastName,
Comment = cc.CommentTemplate.Text ??
cc.CommentFreeText
}),
// ReSharper disable ConstantNullCoalescingCondition
DisplayOrder = (byte?)cur.Subject.DisplayOrder ?? (byte)99,
// ReSharper restore ConstantNullCoalescingCondition
cur.Percentile,
cur.Score,
cur.Symbol,
cur.MasteryLevel,
PastScores = past.Select(p => new
{
p.Score,
p.Symbol,
p.MasteryLevel,
p.ScholarReportCard
.AcademicTermID
}),
Assessments = cur.Assessments
.Select(a => new
{
a.ScholarAssessment.AssessmentID,
a.ScholarAssessment.Assessment.Description,
a.ScholarAssessment.Assessment.Type.Nickname,
a.ScholarAssessment.AssessmentDate,
a.ScoreDesc,
a.ScorePerc,
a.MasteryLevel,
a.ScholarAssessment.Assessment.Type.AssessmentFormat,
a.ScholarAssessment.PublishedStatus,
a.ScholarAssessment.FPScore,
a.ScholarAssessment.TotalScore,
a.ScholarAssessment.Assessment.Type.ScoreType,
a.ScholarAssessment.Assessment.Type.OverrideBelowLabel,
a.ScholarAssessment.Assessment.Type.OverrideApproachingLabel,
a.ScholarAssessment.Assessment.Type.OverrideMeetingLabel,
a.ScholarAssessment.Assessment.Type.OverrideExceedingLabel,
})
})
.ToList();
Linq uses deferred execution for some tasks, for example while iterating through an IEnumerable<>
, so what you call materialization includes some actual data fetching.
var reportCards = db.ScholarReportCards.Where(cr => ...); // this prepares the query
foreach (var rc in reportCards) {} // this executes your query and calls the DB
I think that if you trace/time queries on your SQL server you may see some queries arriving during the "materialization" step. This problem may even be exacerbated by anti-patterns such as the "Select N+1" problem : for example it looks like you're not including the AcademicTerm
objects in your request; if you don't resolving these will result in a select N+1, that is for every ScholarReportCard
there will be a call to the DB to lazily resolve the AcademicTerm
attached.
If we focus on the Linq to DB aspect, at least try not to :
Include
the related datatables you will needInclude
on the table you need)If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With