All our databases were installed using the default collation (Latin1_General_CI_AS).
We plan to change the collation to allow clients to search the database with accent insensitivity.
Questions:
1. What are the negatives (if any) of having an accent-insensitive database?
2. Are there any performance overheads for an accent-insensitive database?
3. Why is the default SQL Server collation accent sensitive? Why would anyone want accent sensitivity by default?
A binary collation sorts and compares data in SQL Server tables based on the bit patterns of each character (Unicode code points for Unicode data), so it is always case sensitive and accent sensitive. Binary is also the fastest sorting order.
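To make the code-point ordering concrete, here is a minimal sketch that sorts four literals under a binary collation. The values and the choice of Latin1_General_BIN2 are illustrative assumptions, not something taken from the question.

```sql
-- Sort a few illustrative values by Unicode code point.
-- Upper-case letters (code points 65, 66) sort before lower-case ones (97, 98).
select v.name
from (values (N'b'), (N'a'), (N'B'), (N'A')) as v(name)
order by v.name collate Latin1_General_BIN2;
-- returns A, B, a, b
```

Under Latin1_General_CI_AS the same query would treat A and a as equal for ordering purposes, so their relative order would not be guaranteed.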
Collations in SQL Server provide sorting rules and case and accent sensitivity properties for data. For non-Unicode (char, varchar) data, a collation also determines the code page, that is, the bit patterns that represent each character. SQL Server supports storing objects that have different collations in a single database.
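Because collations can differ at the server, database, and column level, it can help to check what is actually in effect before changing anything. A small sketch, assuming a table called dbo.TableName (a placeholder name):

```sql
-- Server-level default collation
select serverproperty('Collation') as server_collation;

-- Collation of the current database
select databasepropertyex(db_name(), 'Collation') as database_collation;

-- Collation of each character column in one table
-- (dbo.TableName is a placeholder; substitute your own table)
select c.name as column_name, c.collation_name
from sys.columns as c
where c.object_id = object_id(N'dbo.TableName')
  and c.collation_name is not null;
```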
The COLLATE clause can be used to make a search on a column case sensitive or case insensitive. For example, SQL_Latin1_General_CP1_CS_AS is a case-sensitive collation and SQL_Latin1_General_CP1_CI_AS is its case-insensitive counterpart; these are only two of the many collations available.
If you do not specify a collation, the column is assigned the default collation of the database. You can also use the database_default option in the COLLATE clause to specify that a column in a temporary table use the collation default of the current user database for the connection instead of tempdb.
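As a rough illustration of the database_default option, the sketch below creates a temporary table whose column follows the current user database rather than tempdb. The table #SearchTerms and its column are hypothetical, and the join assumes that dbo.TableName.name uses the database's default collation.

```sql
-- #SearchTerms is a hypothetical temp table used only for illustration.
create table #SearchTerms (
    -- Follow the current user database's collation instead of tempdb's
    term nvarchar(20) collate database_default
);

-- If term had taken tempdb's collation and that differed from the
-- user database's, this comparison could fail with a collation conflict.
select t.ix, t.name
from dbo.TableName as t
join #SearchTerms as s
    on s.term = t.name;

drop table #SearchTerms;
```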
Seriously, changing database collations is a royal pain. See this HOWTO from codeproject, and then think hard before you do it! This is the EASY way!
Firstly, you can allow accent-insensitive searches simply by specifying the collation as part of the search; you don't necessarily have to change the database collation.
select * from TableName
where name collate Latin1_General_CI_AI like @parameter
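Simple as that. However, this will hurt index usage: an index on name is ordered by the column's own collation, so the optimizer generally cannot seek on it for a comparison made under a different collation.
An alternative is to add a computed column with the accent-insensitive collation, which you can index separately.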
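To see what the COLLATE clause changes in a comparison, here is a quick check that needs no tables. The two literals are illustrative; any accented and unaccented pair should behave the same way.

```sql
-- Compare the same pair of strings under an accent-sensitive
-- and an accent-insensitive collation.
select
    case when N'résumé' = N'resume' collate Latin1_General_CI_AS
         then 'equal' else 'different' end as accent_sensitive,
    case when N'résumé' = N'resume' collate Latin1_General_CI_AI
         then 'equal' else 'different' end as accent_insensitive;
-- accent_sensitive: different, accent_insensitive: equal
```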
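create table TableName(
    ix int identity primary key,
    name nvarchar(20) collate latin1_general_ci_as
)
go
-- Computed column: the same value, compared accent-insensitively
alter table TableName
    add name_AI as name collate latin1_general_CI_AI
go
-- Index the computed column so accent-insensitive searches can use it
create index IX_TableName_name_AI
    on dbo.TableName(name_AI)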
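With the computed column and its index in place, a search would target name_AI instead of name. A minimal sketch, where the @parameter value is a made-up search pattern:

```sql
-- Hypothetical search pattern; a trailing wildcard keeps the
-- predicate usable for an index seek.
declare @parameter nvarchar(20) = N'jose%';

-- Filter on the accent-insensitive computed column,
-- but return the original, accent-preserving value.
select ix, name
from dbo.TableName
where name_AI like @parameter;
```

No COLLATE clause is needed in the query, because the computed column already carries the accent-insensitive collation.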
The example above puts it in the table, but you could just as well create an indexed view.
create view dbo.TableName_AI
with schemabinding
as
select ix,
       name collate Latin1_general_CI_AI as name
from dbo.TableName
go
-- Need a unique clustered index first
create unique clustered index IX_TableName_AI_Clustered on dbo.TableName_AI(ix)
-- then the index for searching
create index IX_TableName_AI_name on dbo.TableName_AI(name)
Then, for accent-insensitive searches, use the view TableName_AI.
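A search against the view might look like the sketch below. The @parameter value is a made-up pattern, and the noexpand hint is an addition of mine: on editions that do not match indexed views automatically, it asks the optimizer to use the view's own indexes rather than expanding the view into the base table.

```sql
-- Hypothetical search pattern
declare @parameter nvarchar(20) = N'jose%';

-- In the view, name already has the accent-insensitive collation.
select ix, name
from dbo.TableName_AI with (noexpand)
where name like @parameter;
```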
To answer your specific questions:
1. In an accent-insensitive database, accent-sensitive searches will be slower.
2. Yes, but not so you would notice.
3. It just is. Something has to be the default; if you don't like it, don't use the default!
Think of it this way: "Hard" and "Herd" are not the same word. That one vowel difference is enough - even though they sound similar.
An accent difference (a vs. á) is somewhere between a case difference (A vs. a) and a letter difference (a vs. e). You have to draw the line somewhere.
An accent affects the sound of the word and can make it have a different meaning, though I struggle to think of examples. I guess it makes more sense to someone who has words in their database in a language which makes use of accents.
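For the first point, the reverse of the earlier technique applies: in an accent-insensitive database you can still ask for an accent-sensitive match on a single query. A sketch, assuming the column's collation is accent insensitive and using a made-up search value:

```sql
-- Hypothetical search value
declare @parameter nvarchar(20) = N'José';

-- Force an accent-sensitive comparison for this query only.
-- As with any COLLATE applied to the column in a predicate, an index
-- built under the column's own collation is unlikely to be used for a seek.
select *
from TableName
where name collate Latin1_General_CI_AS = @parameter;
```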