
Why is Latin1_General_CS_AS not case sensitive?

For LIKE queries with a character range, the Latin1_General_CS_AS collation does not behave case-sensitively. A bug report filed with Microsoft about this was closed as "By Design".

However, the Latin1_General_Bin collation is also case-sensitive and works exactly as expected for LIKE queries.

You can see the difference in this simple query:

SELECT
    MyColumn AS Latin1_General_Bin
FROM MyTable
WHERE MyColumn LIKE '%[a-z]%' COLLATE Latin1_General_Bin;

SELECT
    MyColumn AS Latin1_General_CS_AS
FROM MyTable
WHERE MyColumn LIKE '%[a-z]%' COLLATE Latin1_General_CS_AS;

SQL Fiddle Demo.


My questions are:

  1. Why would this be considered "By Design" to be case-insensitive in LIKE?
  2. If this really is better, why does the behavior differ between the two case-sensitive collations _Bin and _CS_AS?

I was going to standardize on Latin1_General_CS_AS for any case-sensitive databases going forward, but this seems like a subtle query bug waiting to happen.

arserbin3 asked Jun 13 '14 16:06



1 Answer

It is not a regular expression. The range [a-z] just means >='a' AND <='z'.

Under Latin1_General_CS_AS, that range includes all letters except capital Z.

Under SQL_Latin1_General_CP1_CS_AS, all letters except capital A fall within that range.
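
The equivalence between the range pattern and a pair of comparisons can be checked directly. The following is a minimal sketch using a handful of sample letters; the column aliases MatchesLike and InRange are made up for illustration.

```sql
-- Compare LIKE '[a-z]' with the equivalent >= 'a' AND <= 'z' test
-- under Latin1_General_CS_AS. The aliases are illustrative only.
SELECT
    C,
    CASE WHEN C LIKE '[a-z]' COLLATE Latin1_General_CS_AS
         THEN 1 ELSE 0 END AS MatchesLike,
    CASE WHEN C >= 'a' COLLATE Latin1_General_CS_AS
          AND C <= 'z' COLLATE Latin1_General_CS_AS
         THEN 1 ELSE 0 END AS InRange
FROM (VALUES ('A'),('B'),('Y'),('Z'),('a'),('b'),('y'),('z')) V(C);
```

If the description above holds, the two columns should agree on every row, with only 'Z' falling outside the range.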


If that is still not clear, compare the sort order of the following values under each of the three collations (swap the collation name in the ORDER BY clause):

SELECT * 
FROM (VALUES ('A'),('B'),('Y'),('Z'), ('a'),('b'),('y'),('z')) V(C)
ORDER BY C COLLATE Latin1_General_Bin 

You can see that the binary collation sorts all the upper case letters together, whereas the other two interleave upper and lower case.

+--------------------+----------------------+-------------------------------+
| Latin1_General_Bin | Latin1_General_CS_AS | SQL_Latin1_General_CP1_CS_AS  |
+--------------------+----------------------+-------------------------------+
| A                  | a                    | A                             |
| B                  | A                    | a                             |
| Y                  | b                    | B                             |
| Z                  | B                    | b                             |
| a                  | y                    | Y                             |
| b                  | Y                    | y                             |
| y                  | z                    | Z                             |
| z                  | Z                    | z                             |
+--------------------+----------------------+-------------------------------+
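
The three columns above can also be produced in one pass rather than by running the query three times. This is a sketch only, assuming ROW_NUMBER() is acceptable for showing each letter's position; the aliases BinPos, WinPos and SqlPos are hypothetical names.

```sql
-- Position of each letter in the sort order of each collation.
-- BinPos / WinPos / SqlPos are illustrative aliases.
SELECT
    C,
    ROW_NUMBER() OVER (ORDER BY C COLLATE Latin1_General_Bin)           AS BinPos,
    ROW_NUMBER() OVER (ORDER BY C COLLATE Latin1_General_CS_AS)         AS WinPos,
    ROW_NUMBER() OVER (ORDER BY C COLLATE SQL_Latin1_General_CP1_CS_AS) AS SqlPos
FROM (VALUES ('A'),('B'),('Y'),('Z'),('a'),('b'),('y'),('z')) V(C)
ORDER BY BinPos;
```

A letter matches [a-z] under a given collation when its position lies between those of 'a' and 'z' in that collation's column.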

This is documented in Books Online (BOL):

In range searches, the characters included in the range may vary depending on the sorting rules of the collation.
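
Because ranges follow the collation's sort order, two ways to match only lowercase letters suggest themselves: list the characters explicitly instead of using a range, or evaluate the range under a binary collation. The sketch below reuses MyTable and MyColumn from the question and has not been checked against every SQL Server version.

```sql
-- Option 1: enumerate the characters. Each listed character is compared
-- individually, so a case-sensitive collation should match lowercase only.
SELECT MyColumn
FROM MyTable
WHERE MyColumn LIKE '%[abcdefghijklmnopqrstuvwxyz]%' COLLATE Latin1_General_CS_AS;

-- Option 2: keep the range but evaluate it under a binary collation,
-- where a through z are contiguous code points.
SELECT MyColumn
FROM MyTable
WHERE MyColumn LIKE '%[a-z]%' COLLATE Latin1_General_Bin;
```

Option 1 keeps the comparison under the column's linguistic collation; Option 2 is shorter but compares by code point.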

Martin Smith answered Oct 12 '22 01:10