We have an existing body of C# code that calls parameterized ad-hoc SQL Server queries in many places. We never specify SqlParameter.Size, and it's documented that in this case, the SqlParameter class infers the size from the parameter value. We just recently became aware of the SQL Server plan cache pollution issues this creates, where a separate plan is cached for each distinct combination of parameter sizes.
Luckily, whenever we create a SqlParameter, we do so via a single utility method, so we have the opportunity to add a few lines to that method and make this problem go away. We are thinking about adding the following:
if((sqlDbType == SqlDbType.VarChar) || (sqlDbType == SqlDbType.NVarChar))
m_sqlParam.Size = -1;
In other words, every time we pass a varchar parameter, pass it as a varchar(max). Based on some quick testing, this works fine, and we can see (via SQL Profiler and sys.dm_exec_cached_plans) that there is now a single plan in the cache for each ad-hoc query, and the type of the string parameter(s) is now varchar(max).
This seems like such an easy solution that there must be some hidden, performance-destroying downside. Is anyone aware of one?
(Please note that we only need to support SQL Server 2008 and later.)
Yes, there is a hidden, performance-destroying downside!
Many thanks to Martin Smith, whose answer (see below) pointed me to the right way to analyze this. I tested with our application's Users table, which has an Email column defined as nvarchar(100), and has a non-clustered index (IX_Users_Email) on the Email column. I modified Martin's example query as follows:
declare @a nvarchar(max) = cast('a' as nvarchar(max))
--declare @a nvarchar(100) = cast('a' as nvarchar(100))
--declare @a nvarchar(4000) = cast('a' as nvarchar(4000))
select Email from Users where Email = @a
Depending which of the "declare" statements I un-comment, I get a VERY different query plan. The nvarchar(100) and nvarchar(4000) versions both give me an index seek on IX_Users_Email -- in fact, any length I specify gives me the same plan. The nvarchar(max) version, on the other hand, gives me an index scan on IX_Users_Email, followed by a Filter operator to apply the predicate.
That's enough for me -- if there is any possibility of getting a scan rather than a seek, then this "cure" is worse than the disease.
New Proposal
I noticed that every time SQL Server parameterizes a query with a varchar parameter, the cached plan simply uses varchar(8000) (or nvarchar(4000)) for the parameter. I figure if it's good enough for SQL Server, it's good enough for me! Replacing the C# code in my original question (above) with:
if(sqlDbType == SqlDbType.VarChar)
m_sqlParam.Size = 8000;
else if(sqlDbType == SqlDbType.NVarChar)
m_sqlParam.Size = 4000;
This seems to solve the plan cache pollution problem without the same impact on the query plans as using a Size of -1. However, I have not done a lot of testing with this, and I am very interested to hear anyone's input on this revised approach.
We had to modify the prior version (New Proposal, above) to handle the case where the parameter value is longer than the maximum. At that point, you have no choice but to make it a varchar(max):
if((sqlDbType == SqlDbType.VarChar) || (sqlDbType == SqlDbType.NVarChar))
{
m_sqlParam.Size = (sqlDbType == SqlDbType.VarChar) ? 8000 : 4000;
if((value != null) && !(value is DBNull) && (value.ToString().Length > m_sqlParam.Size))
m_sqlParam.Size = -1;
}
We've been using this version for about six months with no problems.
It is not ideal in that it is best to specify a parameter that matches the datatype of the column(s) involved.
You would need to check your query plans to see if they still look reasonable.
Trying the following test
CREATE TABLE #T
(
X VARCHAR(10) PRIMARY KEY
)
DECLARE @A VARCHAR(MAX) = CAST('A' AS VARCHAR(MAX))
SELECT *
FROM #T
WHERE X = @A
Gives a plan like
SQL Server adds a compute scalar to the plan which calls the internal function GetRangeWithMismatchedTypes
and still manages to perform an index seek (more details on implicit conversions here).
A counter example where it does matter is shown in the article Why Doesn’t Partition Elimination Work?. The behaviour described in that article also applies for a varchar(max)
parameter against a table partitioned on a varchar(n)
column.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With