I have been getting an error in a previously working stored procedure called by an SSRS report and I have traced it down to a LIKE statement in a scalar function that is called by the stored procedure, in combination with a 7000+ NVARCHAR(MAX) string. It is something similar to:
Msg 8152, Level 16, State 10, Line 14 String or binary data would be truncated.
I can reproduce it with the following code:
DECLARE @name1 NVARCHAR(MAX) = ''
DECLARE @name2 NVARCHAR(MAX) = ''
DECLARE @count INT = 4001
WHILE @count > 0
BEGIN
SET @name1 = @name1 + 'a'
SET @name2 = @name2 + 'a'
SET @count = @count - 1
END
SELECT LEN(@name1)
IF @name1 LIKE @name2
PRINT 'OK'
What's the deal? Is there anyway around this limitation, or is it there for good reason? Thanks.
You can also reproduce it without the terrible loop:
DECLARE @name1 NVARCHAR(MAX), @name2 NVARCHAR(MAX);
SET @name1 = REPLICATE(CONVERT(NVARCHAR(MAX), N'a'), 4000);
SET @name2 = @name1;
IF @name1 LIKE @name2
PRINT 'OK';
SELECT @name1 += N'a', @name2 += N'a';
IF @name1 LIKE @name2
PRINT 'OK';
Result:
OK
Msg 8152, Level 16, State 10, Line 30
String or binary data would be truncated.
In any case, the reason is clearly stated in the documentation for LIKE
(emphasis mine):
match_expression [ NOT ] LIKE pattern [ ESCAPE escape_character ]
...
pattern
Is the specific string of characters to search for in match_expression, and can include the following valid wildcard characters. pattern can be a maximum of 8,000 bytes.
And 8,000 bytes is used up by 4,000 Unicode characters.
I would suggest that comparing the first 4,000 characters is probably sufficient:
WHERE column LIKE LEFT(@param, 4000) + '%';
I can't envision any scenario where you want to compare the whole thing; how many strings contain the same first 4000 characters but then character 4001 is different? If that really is a requirement, I guess you could go to the great lengths identified in the Connect item David pointed out.
A simpler (though probably much more computationally expensive) workaround might be:
IF CONVERT(VARBINARY(MAX), @name1) = CONVERT(VARBINARY(MAX), @name2)
PRINT 'OK';
I suggest that it would be far better to fix the design and stop identifying rows by comparing large strings. Is there really no other way to identify the row you're after? This is like finding your car in the parking lot by testing the DNA of all the Dunkin Donuts cups in all the cup holders, rather than just checking the license plate.
I have the same problem right now, and I do believe my situation -where you want to compare two strings with more than 4000 characters- is a possible situation :-). In my situation, I'm collecting a lot of data from different tables in a NVARCHAR(MAX) field in a specific table, to be able to search on that data using FullText. Keeping that table in sync, is done using the MERGE statement, converting everything to NVARCHAR(MAX). So my MERGE statement would look like this:
MERGE MyFullTextTable AS target
USING (
SELECT --Various stuff from various tables, casting it as NVARCHAR(MAX)
...
) AS source (IndexColumn, FullTextColumn)
ON (target.IndexColumn = source.IndexColumn)
WHEN MATCHED AND source.FullTextColumn NOT LIKE target.FullTextColumn THEN
UPDATE SET FullTextColumn = source.FullTextColumn
WHEN NOT MATCHED THEN
INSERT (IndexColumn, FullTextColumn)
VALUES (source.IndexColumn, source.FullTextColumn)
OUTPUT -- Some stuff
This would produce errors because of the LIKE comparison when the FullText-data is bigger than 4000 characters.
So I created a function that does the comparison. Allthough it's not bullet proof, it works for me. You could also split data in blocks of 4000 characters, and compare each block, but for me (for now) comparing the first 4000 characters in combination with the length, is enough ...
So the Merge-statement would look like:
MERGE MyFullTextTable AS target
USING (
SELECT --Various stuff from various tables, casting it as NVARCHAR(MAX)
...
) AS source (IndexColumn, FullTextColumn)
ON (target.IndexColumn = source.IndexColumn)
WHEN MATCHED AND udfCompareTwoTexts(source.FullTextColumn, target.FullTextColumn) = 1 THEN
UPDATE SET FullTextColumn = source.FullTextColumn
WHEN NOT MATCHED THEN
INSERT (IndexColumn, FullTextColumn)
VALUES (source.IndexColumn, source.FullTextColumn)
OUTPUT -- Some stuff
And the function looks like:
ALTER FUNCTION udfCompareTwoTexts
(
@Value1 AS NVARCHAR(MAX),
@Value2 AS NVARCHAR(MAX)
)
RETURNS BIT
AS
BEGIN
DECLARE @ReturnValue AS BIT = 0
IF LEN(@Value1) > 4000 OR LEN(@Value2) > 4000
BEGIN
IF LEN(@Value1) = LEN(@Value2) AND LEFT(@Value1, 4000) LIKE LEFT(@Value2, 4000)
SET @ReturnValue = 1
ELSE
SET @ReturnValue = 0
END
ELSE
BEGIN
IF @Value1 LIKE @Value2
SET @ReturnValue = 1
ELSE
SET @ReturnValue = 0
END
RETURN @ReturnValue;
END
GO
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With