Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does SQL Server Wildcard Character Range, eg [A-D], work with Case-sensitive Collation?

Can anyone explain the rules for how a wildcard character range, eg [A-D], works with a case-sensitive collation?

I would have thought the following

WHERE CharColumn LIKE '[A-D]%';

would return only records which start with an upper case A, B, C or D, and exclude records that start with a lower case a, b, c or d.

However, in reality, it appears to return records that start with an upper case A but also records that start with B or b, C or c and D or d. It's like only the first character of the range is case-sensitive and the remaining characters in the range are not case-sensitive.

On the other hand, the following

WHERE CharColumn LIKE '[ABCD]%';

does only return records which start with an upper case A, B, C or D. Yet I would have thought [A-D] would be equivalent to [ABCD].

I get the same results in SQL Server 2005 and SQL Server 2008 R2.

Example:
(insert statements written with SQL Server 2008 row constructors for compactness. If each value is given its own insert statement the script will work in SQL Server 2005)

CREATE TABLE #TEST_LIKE_Patterns
    ( 
        ID INT IDENTITY(1,1),
        CharColumn VARCHAR(100) COLLATE Latin1_General_CS_AS
    );

--------------
INSERT INTO #TEST_LIKE_Patterns (CharColumn)
VALUES ('aaa'), ('aAA'), ('AAA'), ('Aaa');
--------------    
INSERT INTO #TEST_LIKE_Patterns (CharColumn)
VALUES ('bbb'), ('bBB'), ('BBB'), ('Bbb');
--------------
INSERT INTO #TEST_LIKE_Patterns (CharColumn)
VALUES ('ccc'), ('cCC'), ('CCC'), ('Ccc');
--------------    
INSERT INTO #TEST_LIKE_Patterns (CharColumn)
VALUES ('ddd'), ('dDD'), ('DDD'), ('Ddd');
--------------    
INSERT INTO #TEST_LIKE_Patterns (CharColumn)
VALUES ('eee'), ('eEE'), ('EEE'), ('Eee');
--------------    
INSERT INTO #TEST_LIKE_Patterns (CharColumn)
VALUES ('fff'), ('fFF'), ('FFF'), ('Fff');
--------------

-- Raw Data:
SELECT *
FROM #TEST_LIKE_Patterns;

SELECT *
FROM #TEST_LIKE_Patterns
WHERE CharColumn LIKE '[A-D]%';

-- Results:
/*
ID   CharColumn
--------------
3    AAA
4    Aaa
5    bbb
6    bBB
7    BBB
8    Bbb
9    ccc
10   cCC
11   CCC
12   Ccc
13   ddd
14   dDD
15   DDD
16   Ddd
*/


SELECT *
FROM #TEST_LIKE_Patterns
WHERE CharColumn LIKE '[ABCD]%';    

-- Results:
/*
ID   CharColumn
    --------------
3    AAA
4    Aaa
7    BBB
8    Bbb
11   CCC
12   Ccc
15   DDD
16   Ddd
*/
like image 982
Simon Tewsi Avatar asked Dec 07 '11 04:12

Simon Tewsi


3 Answers

You need a binary collation as indicated in Md. Elias Hossain's answer.

The explanation is that ranges in the pattern syntax work off Collation sort order rules.

From BOL

In range searches, the characters included in the range may vary depending on the sorting rules of the collation.

So

;WITH T(C) AS
(
SELECT 'A' UNION ALL
SELECT 'B' UNION ALL
SELECT 'C' UNION ALL
SELECT 'D' UNION ALL
select 'a' union all
select 'b' union all
select 'c' union all
select 'd'
)
SELECT *
FROM T
ORDER BY C COLLATE Latin1_General_CS_AS

Returns

C
----
a
A
b
B
c
C
d
D

So the range A-D excludes a but includes the other 3 lower case letters under a CS collation.

like image 125
Martin Smith Avatar answered Oct 17 '22 05:10

Martin Smith


It can be done in either way:

a. Use COLLATE while create the table as:

CREATE TABLE #TEST_LIKE_Patterns
( 
    ID INT IDENTITY(1,1),
    CharColumn VARCHAR(100) COLLATE Latin1_General_BIN
);

b. Use COLLATE while selecting data as

SELECT *
FROM #TEST_LIKE_Patterns
WHERE CharColumn LIKE '%[A-D]%' COLLATE Latin1_General_BIN;
like image 20
Elias Hossain Avatar answered Oct 17 '22 06:10

Elias Hossain


try

SELECT *
FROM #TEST_LIKE_Patterns
WHERE CharColumn  LIKE '[A-D]%' COLLATE Latin1_General_BIN;
like image 36
Igor Borisenko Avatar answered Oct 17 '22 06:10

Igor Borisenko