Which collation do I need to choose? (SQL Server 2008)
I've found a nice and related post on stackoverflow.com regarding this question: How to choose collation of SQL Server database
So if I understand correctly (see the link above):
collation properties/parms
I need to create a database that will store Turkish and English text. I'll choose CI and AI, because I want neither case sensitivity nor accent sensitivity, so that part is easy. I think this is clear for English, but Turkish has some special characters such as ü, ç and ö.
Question:
Since collation is not related to STORING data and I'll use NVARCHAR, why should I choose the collation Turkish_100_CI_AI when I could also use Latin1_General_100_CI_AI, which is also the default on my SQL Server? Both are Latin script.
The same question applies to storing ENGLISH and FRENCH in the same database: why should I use French_100_CI_AI instead of Latin1_General_100_CI_AI?
Can someone advise? Am I wrong?
Collations in SQL Server provide sorting rules, case, and accent sensitivity properties for your data. Collations that are used with character data types, such as char and varchar, dictate the code page and corresponding characters that can be represented for that data type.
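The code page point can be illustrated outside SQL Server. Below is a minimal Python sketch (not T-SQL), assuming that Latin1_General collations map non-Unicode data to Windows code page 1252 and Turkish collations to code page 1254. The helper name `fits_code_page` is made up for this example.

```python
def fits_code_page(text, code_page):
    """Return True if every character of text exists in the given code page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


# ü, ç and ö exist in both code pages.
print(fits_code_page("üçö", "cp1252"), fits_code_page("üçö", "cp1254"))

# ı and ş exist only in the Turkish code page.
print(fits_code_page("ışık", "cp1252"), fits_code_page("ışık", "cp1254"))

# UTF-16 (roughly what nvarchar stores) can hold the text whatever the collation.
print(len("ışık".encode("utf-16-le")))
```

So for char and varchar columns the collation limits which characters can be stored, while for nchar and nvarchar it only affects sorting and comparison.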
The SQL_Latin1_General_CP1_CI_AS collation is a SQL collation, and its rules for sorting Unicode and non-Unicode data are different. The Latin1_General_CI_AS collation is a Windows collation, and its rules for sorting Unicode and non-Unicode data are the same.
The COLLATE clause can be used to make searches on SQL Server columns case-sensitive or case-insensitive. Two commonly used collations for this are SQL_Latin1_General_CP1_CS_AS (case-sensitive) and SQL_Latin1_General_CP1_CI_AS (case-insensitive).
Server-level collation for Microsoft SQL Server
If you don't choose a different collation, the server-level collation defaults to SQL_Latin1_General_CP1_CI_AS. The server collation is applied by default to all databases and database objects. You can't change the collation when you restore from a DB snapshot.
You can set the collation explicitly for each column using the COLLATE clause, if your data model allows you to separate data into language-specific columns.
You can also apply the COLLATE clause in a SELECT statement, for example when you keep all language data in the same place and only filter by language in the query.
As far as I'm aware Turkish (sort order) is not covered by Latin1.
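The best-known difference is the Turkish dotted and dotless I: in Turkish the case pairs are i/İ and ı/I, not i/I. Here is a small Python sketch (not T-SQL, and not how SQL Server implements collations) that models a case-insensitive comparison under generic rules and under Turkish rules. The names `TURKISH_UPPER`, `upper_generic`, `upper_turkish` and `equal_ci` are hypothetical helpers for illustration only.

```python
# Turkish-specific upper-case mappings: dotted i -> İ, dotless ı -> I.
TURKISH_UPPER = {"i": "İ", "ı": "I"}


def upper_generic(text):
    """Language-neutral upper-casing (Python's default Unicode rules)."""
    return text.upper()


def upper_turkish(text):
    """Upper-casing that applies the Turkish i/İ and ı/I pairs first."""
    return "".join(TURKISH_UPPER.get(ch, ch.upper()) for ch in text)


def equal_ci(a, b, upper):
    """Case-insensitive equality under the given upper-casing rule."""
    return upper(a) == upper(b)


print(equal_ci("i", "I", upper_generic))  # generic rules: i and I match
print(equal_ci("i", "I", upper_turkish))  # Turkish rules: they do not
print(equal_ci("i", "İ", upper_turkish))  # Turkish rules: i matches İ
print(equal_ci("ı", "I", upper_turkish))  # Turkish rules: ı matches I
```

As far as I know, a Turkish CI collation behaves the same way for these letters, so a search for 'i' would not match 'I'. That is worth testing on your own data before picking a collation.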
Collation refers to a set of rules that determine how data is sorted and compared. Character data is sorted using rules that define the correct character sequence, with options for specifying case-sensitivity, accent marks, kana character types and character width.
Case sensitivity
If A and a, B and b, etc. are treated in the same way, then it is case-insensitive. A computer treats A and a differently because they have different character codes: in ASCII, A is 65 and a is 97.
Accent sensitivity
If a and á, o and ó are treated in the same way, then it is accent-insensitive. A computer treats a and á differently because they have different character codes. For example, a is 97 in ASCII, while á is not an ASCII character at all; its code is 225 in Latin-1 and Unicode.
Kana sensitivity
When the Japanese kana characters Hiragana and Katakana are treated differently, it is called kana-sensitive.
Width sensitivity
When a single-byte (half-width) character and the same character represented as a double-byte (full-width) character are treated differently, it is width-sensitive.
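The four sensitivity options above can be modelled with a short Python sketch. This is only an approximation using the standard `unicodedata` module, not SQL Server's actual comparison algorithm: NFKC normalization stands in for width folding (it also folds other compatibility characters), and stripping combining marks stands in for accent folding. The function names `fold` and `equal` are made up for this example.

```python
import unicodedata


def _katakana_to_hiragana(text):
    # Katakana U+30A1..U+30F6 sits exactly 0x60 above the matching Hiragana.
    return "".join(
        chr(ord(ch) - 0x60) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )


def fold(text, case=True, accent=True, kana=True, width=True):
    """Fold away each distinction whose flag is True (i.e. 'insensitive')."""
    if width:
        # Full-width and half-width forms map to their standard form.
        text = unicodedata.normalize("NFKC", text)
    if kana:
        text = _katakana_to_hiragana(text)
    if accent:
        # Decompose, then drop the combining accent marks.
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if case:
        text = text.casefold()
    return text


def equal(a, b, **options):
    """Compare two strings after folding with the given options."""
    return fold(a, **options) == fold(b, **options)


print(equal("A", "a"), equal("A", "a", case=False))       # case
print(equal("a", "á"), equal("a", "á", accent=False))     # accent
print(equal("あ", "ア"), equal("あ", "ア", kana=False))    # kana
print(equal("Ａ", "A"), equal("Ａ", "A", width=False))    # width
```

Passing `False` for a flag corresponds to the "sensitive" variant of a collation (CS, AS, KS, WS), and leaving it `True` to the "insensitive" one.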
More info can be found here. I hope this answer helped.