Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Change collation during a query in SQL Server? Proper sort for English/Cyrillic/EU languages

I am doing research for an existing system that has SQL Server as a backend. We're starting to store ad-hoc item title data in multiple languages (including Russian language in Cyrillic characters and other European languages). My question is about sorting this data, what are my options for changing the sort order on demand without changing the collation? Can I decide on the collation in the query itself?

Here's a sample query I'm playing with to try and see:

SELECT STR FROM (
SELECT N'а01' AS STR UNION
SELECT N'б02' AS STR UNION
SELECT N'в03' AS STR UNION
SELECT N'г04' AS STR UNION
SELECT N'д05' AS STR UNION
SELECT N'е06' AS STR UNION
SELECT N'ё07' AS STR UNION
SELECT N'ж08' AS STR UNION
SELECT N'з09' AS STR UNION
SELECT N'и10' AS STR UNION
SELECT N'й11' AS STR UNION
SELECT N'к12' AS STR UNION
SELECT N'л13' AS STR UNION
SELECT N'м14' AS STR UNION
SELECT N'н15' AS STR UNION
SELECT N'о16' AS STR UNION
SELECT N'п17' AS STR UNION
SELECT N'р18' AS STR UNION
SELECT N'с19' AS STR UNION
SELECT N'т20' AS STR UNION
SELECT N'у21' AS STR UNION
SELECT N'ф22' AS STR UNION
SELECT N'х23' AS STR UNION
SELECT N'ц24' AS STR UNION
SELECT N'ч25' AS STR UNION
SELECT N'ш26' AS STR UNION
SELECT N'щ27' AS STR UNION
SELECT N'ъ28' AS STR UNION
SELECT N'ы29' AS STR UNION
SELECT N'ь30' AS STR UNION
SELECT N'э31' AS STR UNION
SELECT N'ю32' AS STR UNION
SELECT N'я33' AS STR UNION
SELECT N'a34' AS STR UNION
SELECT N'b35' AS STR UNION
SELECT N'c36' AS STR UNION
SELECT N'd37' AS STR UNION
SELECT N'e38' AS STR UNION
SELECT N'f39' AS STR UNION
SELECT N'g40' AS STR UNION
SELECT N'h41' AS STR UNION
SELECT N'i42' AS STR UNION
SELECT N'j43' AS STR UNION
SELECT N'k44' AS STR UNION
SELECT N'l45' AS STR UNION
SELECT N'm46' AS STR UNION
SELECT N'n47' AS STR UNION
SELECT N'o48' AS STR UNION
SELECT N'p49' AS STR UNION
SELECT N'q50' AS STR UNION
SELECT N'r51' AS STR UNION
SELECT N's52' AS STR UNION
SELECT N't53' AS STR UNION
SELECT N'u54' AS STR UNION
SELECT N'v55' AS STR UNION
SELECT N'w56' AS STR UNION
SELECT N'x57' AS STR UNION
SELECT N'y58' AS STR UNION
SELECT N'z59' AS STR UNION
SELECT N'A60' AS STR UNION
SELECT N'B70' AS STR UNION
SELECT N'C71' AS STR UNION
SELECT N'D72' AS STR UNION
SELECT N'E73' AS STR UNION
SELECT N'F74' AS STR UNION
SELECT N'G75' AS STR UNION
SELECT N'H76' AS STR UNION
SELECT N'I77' AS STR UNION
SELECT N'J78' AS STR UNION
SELECT N'K79' AS STR UNION
SELECT N'L80' AS STR UNION
SELECT N'M81' AS STR UNION
SELECT N'N82' AS STR UNION
SELECT N'O83' AS STR UNION
SELECT N'P84' AS STR UNION
SELECT N'Q85' AS STR UNION
SELECT N'R86' AS STR UNION
SELECT N'S87' AS STR UNION
SELECT N'T88' AS STR UNION
SELECT N'U89' AS STR UNION
SELECT N'V90' AS STR UNION
SELECT N'W91' AS STR UNION
SELECT N'X92' AS STR UNION
SELECT N'Y93' AS STR UNION
SELECT N'Z94' AS STR
) AS SUBTABLE
ORDER BY STR ASC

The order I get is this:

a34
A60
b35
B70
c36
C71
d37
D72
e38
E73
f39
F74
g40
G75
h41
H76
i42
I77
j43
J78
k44
K79
l45
L80
m46
M81
n47
N82
o48
O83
p49
P84
q50
Q85
r51
R86
s52
S87
t53
T88
u54
U89
v55
V90
w56
W91
x57
X92
y58
Y93
z59
Z94
а01
б02
в03
г04
д05
е06
ё07
ж08
з09
и10
й11
к12
л13
м14
н15
о16
п17
р18
с19
т20
у21
ф22
х23
ц24
ч25
ш26
щ27
ъ28
ы29
ь30
э31
ю32
я33

It looks like some of the Cyrillic characters are similar to their English equivalents, are there ways to have those interleave as part of "A to Z"?

What other sort options are available in SQL Server?

If I can't do it in the query, what's the most robust collation to pick on a per-column basis?

like image 613
artlung Avatar asked Dec 29 '22 06:12

artlung


1 Answers

You can select a Cyrillic collation like:

order by STR collate Cyrillic_General_CI_AI

A full list of collations can be found with a call to:

select * from ::fn_helpcollations()

CS means Case Sensitive, AI means Accent Insensitive.

like image 69
Andomar Avatar answered Jan 14 '23 10:01

Andomar