I have a query that returns several thousand records, joined across several tables. In the WHERE clause, a date is checked to be no more than two months in the past. At first, the query had this date boundary set up as a variable and the WHERE used that.
DECLARE @startdate as DATE = DATEADD(MONTH, -2, SYSDATETIME())
select [....]
where dateinquestion >= @startdate
This runs just fine (returns the result expected in under 4 seconds), but I wanted to get rid of the variable declaration and move the assignment into the WHERE clause itself:
select [....]
where dateinquestion >= DATEADD(MONTH, -2, SYSDATETIME())
This runs for over a minute and eats all CPU. I killed the query to stop pushing the server, no results were given. Changing to GETDATE() (I don't need the precision from SYSDATETIME() anyway) speeds things up:
select [....]
where dateinquestion >= DATEADD(MONTH, -2, GETDATE())
Results are similar to scenario 1.
I believe this is because SYSDATETIME is evaluated on a per-row-basis because processing a row costs several nanoseconds, significant to SYSDATETIME. GETDATE however, having a higher change-threshold, is unaffected (or less affected) and does not change - or need re-evaluating - on a per-row-basis.
Can you confirm? Is my assumption concerning this behaviour correct?
I've searched for this, but couldn't find anything except this, which only concerns assigning SYSDATETIME() to a variable, not using it in a WHERE: Does SYSDATETIME() cost more than GETDATE()?
Also this, but only GETDATE is used in the example: TSQL datetimes functions in where clause
The main difference between GETDATE() and SYSDATETIME() is that GETDATE returns the current date and time as DATETIME but SYSDATETIME returns a DATETIME2 value, which is more precise.
GETDATE can be used in a SELECT statement select list or in the WHERE clause of a query.
The DATEADD() function takes three arguments: datepart , number , and date . Here, the value of datepart is day , because the unit of time you want to subtract is day. The second argument is -1 (you subtract 1 day, which is the same as adding -1 day).
GETDATE is a nondeterministic function.
The most important difference between GETDATE
and SYSDATETIME
is the type of the returned value. SYSDATETIME
is not evaluated for each row, same as GETDATE
is not evaluated for each row. They are Runtime Constant Functions See also https://dba.stackexchange.com/questions/18459/does-sql-server-evaluate-functions-once-for-every-row
What is the type of dateinquestion
column?
When you used a @startdate
variable you converted the result of SYSDATETIME
to date
. When you don't use a variable, the result of DATEADD
has different types in your examples.
To make both queries with GETDATE
and SYSDATETIME
equivalent you can cast to date
explicitly:
Compare
select [....]
where dateinquestion >= CAST(DATEADD(MONTH, -2, SYSDATETIME()) AS date)
vs
select [....]
where dateinquestion >= CAST(DATEADD(MONTH, -2, GETDATE()) as date)
I'd be surprised if these two queries ran differently.
Ideally you should cast them to the same type that dateinquestion
has.
You said that variant with SYSDATETIME
uses a lot of CPU. If dateinquestion
has type datetime
, then a possible explanation could be this. It seems that in this variant the values from dateinquestion
column are implicitly converted to datetime2(7)
type before comparison. For each row. First, it uses CPU. Second, it may prevent optimizer from using an index on this column (if there is an index).
To see what is really going on, rather than guessing, compare the actual execution plan of both variants.
By the way, the variant with variable is not equivalent to inlined variant for the optimizer. Optimizer doesn't know the value of the variable, but it knows the value of the GETDATE()
and SYSDATETIME()
, so cardinality estimations are different, which can lead to different plans and different performance.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With