
What's SQL Server's analogue of MySQL's unicode_ci collation?

As far as I understand, MySQL's unicode_ci collations (utf8_unicode_ci in particular) are meant to support all characters regardless of locale.

I need to achieve the same with SQL Server 2008 R2. My database is going to contain data in very different languages (not limited to Latin-based alphabets). I am not going to use non-Unicode strings at all. What collation should I choose?

Ivan asked Feb 20 '11 20:02


1 Answer

You might as well go with Latin1_General_CI_AI

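The reason is that SQL Server stores Unicode data in NVARCHAR columns, and it is more flexible in that it can mix VARCHAR (1 byte per character with a single-byte code page) and NVARCHAR (2 bytes per character) data. An NVARCHAR column can hold any Unicode character whatever its collation, so to match UTF8 any collation would do; the collation only decides how strings are compared and sorted. As for CI: every non-binary collation in 2008 can be made case-insensitive (in the UI it is the "case sensitive" checkbox; leave it unchecked for insensitive).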
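To make the 1-byte versus 2-byte point concrete, here is a small Python sketch (Python rather than T-SQL, so it is only an analogy). It assumes cp1252 as the code page behind a Latin1_General VARCHAR column and UTF-16LE as the NVARCHAR encoding; the helper name storage_bytes is made up for this illustration.

```python
def storage_bytes(text, encoding):
    """Return the encoded size in bytes, or None if the encoding cannot hold the text."""
    try:
        return len(text.encode(encoding))
    except UnicodeEncodeError:
        return None


# Assumption: a Latin1_General VARCHAR column uses Windows code page 1252.
VARCHAR_CODE_PAGE = "cp1252"
# Assumption: NVARCHAR is modelled as UTF-16 little-endian, 2 bytes per code unit.
NVARCHAR_ENCODING = "utf-16-le"

for sample in ["hello", "Привет", "日本語"]:
    print(
        sample,
        "varchar:", storage_bytes(sample, VARCHAR_CODE_PAGE),
        "nvarchar:", storage_bytes(sample, NVARCHAR_ENCODING),
    )
```

A None in the varchar column of the output means the code page cannot represent the text at all, which is why sticking to NVARCHAR makes the choice of collation irrelevant for storage.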

The last part of the name (AI, accent-insensitive) and other options such as width sensitivity are just additional tuning that SQL Server offers.
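What case-insensitive (CI) and accent-insensitive (AI) comparison means can be sketched in Python. This is only an approximation built on Unicode normalization, not the algorithm SQL Server actually uses, and the function names ci_ai_key and ci_ai_equal are invented for the example.

```python
import unicodedata


def ci_ai_key(text):
    """Build a comparison key that ignores case and accents (rough approximation)."""
    # Decompose characters so accents become separate combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks: this is the accent-insensitive part.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold case: this is the case-insensitive part.
    return stripped.casefold()


def ci_ai_equal(a, b):
    """Compare two strings the way a CI_AI collation roughly would."""
    return ci_ai_key(a) == ci_ai_key(b)


print(ci_ai_equal("Résumé", "resume"))
print(ci_ai_equal("Москва", "москва"))
print(ci_ai_equal("abc", "abd"))
```

A real collation also defines sort order and language-specific rules, which this sketch ignores; it only shows why 'Résumé' and 'resume' would match under CI_AI but not under CS_AS.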

Point #2 from http://forums.mysql.com/read.php?103,187048,188748:

utf8_unicode_ci is fine for all these languages: Russian, Bulgarian, Belarusian, Macedonian, Serbian, and Ukrainian.

If you require sorting for a particular language (languages handle accents differently), you need that language's specific dictionary order; see http://msdn.microsoft.com/en-us/library/ms144250.aspx for the list. Otherwise, Latin1_General sorts by rules based on Latin-US.

RichardTheKiwi answered Nov 10 '22 08:11