Can anyone explain the rules for how a wildcard character range, eg [A-D], works with a case-sensitive collation?
I would have thought the following
WHERE CharColumn LIKE '[A-D]%';
would return only records which start with an upper case A, B, C or D, and exclude records that start with a lower case a, b, c or d.
However, in reality, it appears to return records that start with an upper case A but also records that start with B or b, C or c and D or d. It's like only the first character of the range is case-sensitive and the remaining characters in the range are not case-sensitive.
On the other hand, the following
WHERE CharColumn LIKE '[ABCD]%';
does only return records which start with an upper case A, B, C or D. Yet I would have thought [A-D] would be equivalent to [ABCD].
I get the same results in SQL Server 2005 and SQL Server 2008 R2.
Example:
(insert statements written with SQL Server 2008 row constructors for compactness. If each value is given its own insert statement the script will work in SQL Server 2005)
CREATE TABLE #TEST_LIKE_Patterns
(
ID INT IDENTITY(1,1),
CharColumn VARCHAR(100) COLLATE Latin1_General_CS_AS
);
--------------
INSERT INTO #TEST_LIKE_Patterns (CharColumn)
VALUES ('aaa'), ('aAA'), ('AAA'), ('Aaa');
--------------
INSERT INTO #TEST_LIKE_Patterns (CharColumn)
VALUES ('bbb'), ('bBB'), ('BBB'), ('Bbb');
--------------
INSERT INTO #TEST_LIKE_Patterns (CharColumn)
VALUES ('ccc'), ('cCC'), ('CCC'), ('Ccc');
--------------
INSERT INTO #TEST_LIKE_Patterns (CharColumn)
VALUES ('ddd'), ('dDD'), ('DDD'), ('Ddd');
--------------
INSERT INTO #TEST_LIKE_Patterns (CharColumn)
VALUES ('eee'), ('eEE'), ('EEE'), ('Eee');
--------------
INSERT INTO #TEST_LIKE_Patterns (CharColumn)
VALUES ('fff'), ('fFF'), ('FFF'), ('Fff');
--------------
-- Raw Data:
SELECT *
FROM #TEST_LIKE_Patterns;
SELECT *
FROM #TEST_LIKE_Patterns
WHERE CharColumn LIKE '[A-D]%';
-- Results:
/*
ID CharColumn
--------------
3 AAA
4 Aaa
5 bbb
6 bBB
7 BBB
8 Bbb
9 ccc
10 cCC
11 CCC
12 Ccc
13 ddd
14 dDD
15 DDD
16 Ddd
*/
SELECT *
FROM #TEST_LIKE_Patterns
WHERE CharColumn LIKE '[ABCD]%';
-- Results:
/*
ID CharColumn
--------------
3 AAA
4 Aaa
7 BBB
8 Bbb
11 CCC
12 Ccc
15 DDD
16 Ddd
*/
You need a binary collation as indicated in Md. Elias Hossain's answer.
The explanation is that ranges in the pattern syntax work off Collation sort order rules.
From BOL
In range searches, the characters included in the range may vary depending on the sorting rules of the collation.
So
;WITH T(C) AS
(
SELECT 'A' UNION ALL
SELECT 'B' UNION ALL
SELECT 'C' UNION ALL
SELECT 'D' UNION ALL
select 'a' union all
select 'b' union all
select 'c' union all
select 'd'
)
SELECT *
FROM T
ORDER BY C COLLATE Latin1_General_CS_AS
Returns
C
----
a
A
b
B
c
C
d
D
So the range A-D
excludes a
but includes the other 3 lower case letters under a CS
collation.
It can be done in either way:
a. Use COLLATE
while create the table as:
CREATE TABLE #TEST_LIKE_Patterns
(
ID INT IDENTITY(1,1),
CharColumn VARCHAR(100) COLLATE Latin1_General_BIN
);
b. Use COLLATE
while selecting data as
SELECT *
FROM #TEST_LIKE_Patterns
WHERE CharColumn LIKE '%[A-D]%' COLLATE Latin1_General_BIN;
try
SELECT *
FROM #TEST_LIKE_Patterns
WHERE CharColumn LIKE '[A-D]%' COLLATE Latin1_General_BIN;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With