Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Server case insensitive collation

What are the benefits/drawbacks of using a case insensitive collation in SQL Server (in terms of query performance)?

I have a database that is currently using a case-insensitive collation, and I don't really like it. I would very much like to change it to case sensitive. What should I be aware of when changing the collation?

like image 663
Klaus Byskov Pedersen Avatar asked Nov 17 '10 12:11

Klaus Byskov Pedersen


People also ask

Which collation is case insensitive?

A case-insensitive collation ignores the differences between uppercase and lowercase letters for string comparison and sorting, whereas a case-sensitive collation does not. For example, in case-insensitive collation, “A” and “a” are equal.

Is SQL_Latin1_General_CP1_CI_AS case sensitive?

Database collation For example, the default server-level collation in SQL Server for the "English (United States)" machine locale is SQL_Latin1_General_CP1_CI_AS , which is a case-insensitive, accent-sensitive collation.

Is SQL_Latin1_General_CP1_CI_AS the same as Latin1_General_CI_AS?

The SQL_Latin1_General_CP1_CI_AS collation is a SQL collation and the rules around sorting data for unicode and non-unicode data are different. The Latin1_General_CI_AS collation is a Windows collation and the rules around sorting unicode and non-unicode data are the same.

What is collate Latin1_General_CS_AS in SQL Server?

According the SQL Server Books Online, the characters included in range searches depend on the sorting rules of the collation. Collation Latin1_General_CS_AS uses dictionary order so both upper and lower case characters of the specified range are included.


4 Answers

If you change the collation on the database, you also have to change it on each column individually - they maintain the collation setting that was in force when their table was created.

create database CollTest COLLATE Latin1_General_CI_AI
go
use CollTest
go
create table T1 (
    ID int not null,
    Val1 varchar(50) not null
)
go
select name,collation_name from sys.columns where name='Val1'
go
alter database CollTest COLLATE Latin1_General_CS_AS
go
select name,collation_name from sys.columns where name='Val1'
go

Result:

name collation_name
---- --------------
Val1 Latin1_General_CI_AI

name collation_name
---- --------------
Val1 Latin1_General_CI_AI
like image 178
Damien_The_Unbeliever Avatar answered Oct 17 '22 09:10

Damien_The_Unbeliever


I can't find anything to confirm whether properly constructed queries work faster on a case-sensitive vs case-insensitive database (although I suspect the difference is negligible), but a few things are clear to me:

  1. If your business requirements don't ask for it, you are putting yourself up to a lot of extra work (this is the crux of both HLGEM and Damien_The_Unbeliever's answers).
  2. If your business requirements don't ask for it, you are setting yourself up for a lot of possible errors.
  3. Its way too easy to construct poorly performing queries in a case-insensitive database if a case sensitive lookup is required:

A query like:

... WHERE UPPER(GivenName) = 'PETER'

won't use an index on GivenName. You would think something like:

... WHERE GivenName = 'PETER' COLLATE SQL_Latin1_General_CP1_CS_AS

would work better, and it does. But for maximum performance you'd have to do something like:

... WHERE GivenName = 'PETER' COLLATE SQL_Latin1_General_CP1_CS_AS
    AND GivenName LIKE 'PETER'

(see this article for the details)

like image 32
BradC Avatar answered Oct 17 '22 08:10

BradC


I would say the biggest drawback to changing to a case sensitive collation in a production database would be that many, if not most, of your queries would fail because they are currently designed to ignore case.

I've not tried to change collation on an existing datbase, but I suspect it could be quite time consuming to do as well. You probably will have to lock your users out completely while the process happens too. Do not try this unless you have thoroughly tested on dev.

like image 31
HLGEM Avatar answered Oct 17 '22 10:10

HLGEM


If you change the database collation but not the server collation (and they then don't match as a result), watch out when using temporary tables. Unless otherwise specified in their CREATE statement, they will use the server's default collation rather than that of the database which may cause JOINs or other comparisons against your DB's columns (assuming they're also changed to the DB's collation, as alluded to by Damien_The_Unbeliever) to fail.

like image 31
MartW Avatar answered Oct 17 '22 10:10

MartW