In a table T with 20 million rows and a unique constraint on integer column uk, the following query induces SQL Server 2019 to perform a scan of all index entries instead of a single seek:
select max(uk)
from T
where uk <= @x
@x has a value that is close to the maximum for column uk, but that should be beside the point. The engine should always do a single seek in a situation like that, period.
Curiously enough, in SSMS the plan for this query looks exactly like the plan on the development server where the query works as intended, in every detail. The only difference is the actual number of rows read, which is 1 on the development server and 20 million on staging.
Note: the table sizes are similar but their query histories are wildly divergent. And, yes, statistics were updated during the bug hunt, but to no avail.
For now I've reformulated the query like this in order for production to proceed:
select top 1 uk
from T
where uk <= @x
order by 1 desc
That results in the expected single seek, and the execution plan is only marginally less efficient than the plan with MAX() would normally be.
Even though the problem is fixed for this particular query, our application contains many more subsidiary queries just like it and I don't relish the idea that any one of them could suddenly start misbehaving like this one.
What could possibly induce such an egregious mis-fire of the SQL engine? How can we guard against it?
Here is the requested table structure:
create table hlg.Anfragen
(
tx_utc datetime2(4) not null, -- vom Server vereindeutigt, da gleichzeitig Primärschlüssel
tx_id int not null, -- als externe Korrelations-Id (kann später evtl. entfallen)
-- ... more fields here (avg. row size 46.2 bytes in Disk Usage per Table report)
constraint PK_hlg_Anfragen_TxUTC
primary key clustered (tx_utc) with (fillfactor = 100),
constraint UK_hlg_Anfragen_TxId
unique (tx_id)
)
UPDATE: looking at the XML execution plan shows ScanDirection="BACKWARD" as expected on the development server but ScanDirection="FORWARD" on staging, even though the predicate is 'LE' in both cases and all the other details seem to be the same as well (like Index="[UK_hlg_Anfragen_TxId]").
The XML execution plan for an invocation of the stored procedure is now at https://pastetheplan.com. I have not been able to reproduce the problem with minimal ad hoc queries, only by executing the stored procedure. The offending query is the third one from the top.
This does look like a parameter sniffing issue.
In your case it estimates that the WHERE tx_id <= @BisTxId will return 1 row anyway. Not the 19,821,936 it actually reads.
So therefore it believes the cost of that seek operator is so low anyway there is no cost benefit of applying the ScalarGbAggToTop transformation. This will be based on the parameter value "sniffed" when the execution plan is initially compiled.
You could add a RECOMPILE hint to get the plan compiled on each execution so it takes account of the actual runtime parameter value or add an OPTION (OPTIMIZE FOR (@BisTxId UNKNOWN)) hint, so it just uses a generic guess rather than estimates based on the first parameter values seen, but if you are changing the query text anyway probably will be best just to rewrite the query (as you have already done) so the logical specification more closely matches your desired plan shape and you are not dependent on this transformation being applied.
An example repro showing the parameter sniffing impact is at DB Fiddle
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With