I am new to execution plans in SQL Server 2005, but this mystifies me.
When I run this code ...
((SELECT COUNT(DISTINCT StudentID)
FROM vwStudentUnitSchedules AS v
WHERE v.UnitActive = 1
AND v.UnitOutcomeCode IS NULL
AND v.UnitCode = su.UnitCode
AND v.StartDate = su.StartDate
AND v.StudentCampus = st.StudentCampus) - 1) AS ClassSize
I use this to get class sizes. It times out, and when I run it on its own it takes about 30 seconds.
But when I run it with this slight modification ...
((SELECT COUNT(DISTINCT LTRIM(RTRIM(UPPER(StudentID))))
FROM vwStudentUnitSchedules AS v
WHERE v.UnitActive = 1
AND v.UnitOutcomeCode IS NULL
AND v.UnitCode = su.UnitCode
AND v.StartDate = su.StartDate
AND v.StudentCampus = st.StudentCampus) - 1) AS ClassSize
It runs almost instantly.
Is it because of the LTRIM(), RTRIM() and UPPER() functions? Why would they make things go faster? Is it because COUNT(DISTINCT ...) is an aggregate that compares values from left to right, character by character? And yes, StudentID is a VARCHAR(10).
Thanks
A cached query plan would certainly affect your speed on a second run.
Just a theory: if that isn't the case, perhaps it's down to the trim. The distinct has to compare every string, so if trimming makes the strings shorter, there may be fewer characters to compare.
Depending on your database engine, also see if BINARY_CHECKSUM would make it any faster. If it does, maybe my theory is right.
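One way to check whether caching explains the difference is to time both versions from a cold cache. The following is only a rough sketch for a development server. It uses the view and column names from the question, but drops the correlated predicates on su and st so that it can run as a standalone query, which is a simplifying assumption and may not reproduce the original plan.

```sql
-- Sketch only: run on a development server, never in production.
-- DBCC DROPCLEANBUFFERS and DBCC FREEPROCCACHE affect the whole instance
-- and need elevated permissions.

SET STATISTICS TIME ON;   -- report CPU and elapsed time for each statement
SET STATISTICS IO ON;     -- report logical and physical reads for each table

-- Version 1: plain COUNT(DISTINCT StudentID), starting from a cold cache.
CHECKPOINT;               -- write dirty pages so the buffer pool can be emptied
DBCC DROPCLEANBUFFERS;    -- remove cached data pages
DBCC FREEPROCCACHE;       -- remove cached query plans

SELECT COUNT(DISTINCT StudentID)
FROM vwStudentUnitSchedules AS v
WHERE v.UnitActive = 1
  AND v.UnitOutcomeCode IS NULL;

-- Version 2: the LTRIM/RTRIM/UPPER variant, again from a cold cache.
CHECKPOINT;
DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;

SELECT COUNT(DISTINCT LTRIM(RTRIM(UPPER(StudentID))))
FROM vwStudentUnitSchedules AS v
WHERE v.UnitActive = 1
  AND v.UnitOutcomeCode IS NULL;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

If both versions take a similar time when each starts cold, caching is the likely explanation. If the second is still much faster, comparing the two actual execution plans would be the next step.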
((SELECT COUNT(DISTINCT BINARY_CHECKSUM(LTRIM(RTRIM(UPPER(StudentID)))))