
Need a case insensitive collation where ss != ß

For a specific column in a database running on SQL Server Express 2012, I need a collation where ss and ß are not considered the same when comparing strings. Likewise, ä and ae, ö and oe, and ü and ue should each be treated as different. Latin1_General_CI_AS provides the latter, but it does not distinguish ss and ß. That is, WHERE ThatColumn = 'Fass' would yield both Fass and Faß.
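To make the behaviour concrete, here is a minimal T-SQL sketch that should reproduce it. The table variable @Persons and the sample rows are hypothetical, and the sketch assumes a varchar column whose code page can store ß and ü (for example, code page 1252).

```sql
-- Hypothetical sample data; only the column name ThatColumn comes from the question.
DECLARE @Persons TABLE (
    ThatColumn varchar(50) COLLATE Latin1_General_CI_AS
);

INSERT INTO @Persons (ThatColumn)
VALUES ('Fass'), ('Faß'), ('fass'), ('Mueller'), ('Müller');

-- As described in the question, this comparison does not separate ss from ß,
-- so 'Faß' is matched in addition to 'Fass' and 'fass'.
SELECT ThatColumn
FROM @Persons
WHERE ThatColumn = 'Fass';

-- ue and ü are treated as different, so 'Müller' should not be matched here.
SELECT ThatColumn
FROM @Persons
WHERE ThatColumn = 'Mueller';
```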

I would simply stick to BIN/BIN2, but I need case insensitivity. If nothing else works I'll have to use Latin1_General_BIN/Latin1_General_BIN2 and make sure everything is uppercase or lowercase myself. It would mean more work as I need to be able to retrieve the version with proper casing as well.
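The manual fallback could look roughly like the sketch below: both sides of the comparison are upper-cased and then compared under a binary collation, while the stored value keeps its original casing. The table variable @Persons, the column Name and the variable @Search are hypothetical names. It assumes that UPPER leaves ß unchanged, which is worth verifying on the target server.

```sql
-- Hypothetical sample data for illustration.
DECLARE @Persons TABLE (
    Name varchar(50) COLLATE Latin1_General_CI_AS
);

INSERT INTO @Persons (Name)
VALUES ('Fass'), ('Faß'), ('fass'), ('Mueller'), ('Müller');

DECLARE @Search varchar(50) = 'fass';

-- Upper-casing both sides makes the match case insensitive;
-- the binary collation then compares the results code point by code point,
-- so 'FASS' and 'FAß' are expected to differ.
SELECT Name  -- returned with its original casing
FROM @Persons
WHERE UPPER(Name) COLLATE Latin1_General_BIN2
    = UPPER(@Search) COLLATE Latin1_General_BIN2;
```

Wrapping the column in UPPER prevents an ordinary index on Name from being used for a seek. If that matters, a persisted computed column holding the upper-cased value could be indexed instead.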

But if there's a collation that does what I need, please let me know. Thanks in advance!

Update: More information about the requirements: the database contains personal names from a legacy system which only supported ASCII characters. That is, names like Müller and Faß are stored as Mueller and Fass. In the new system the user will have a function to rename those persons, e.g. rename "Mueller" to "Müller". To find the entities that need renaming I need to search for rows containing e.g. "Fass". But as it is now the query also returns "Faß", which is not what I want. I still need/want case insensitivity as the user should be able to search for "fass" and still get "Fass".

There's more to the system, but I can definitely say that I need to distinguish between ss and ß, ä and ae etc.

Andre Loker asked Nov 13 '22 06:11



1 Answer

The collation SQL_Latin1_General_CP1_CI_AS considers 'ss' to be different from 'ß', so it may work for you. It's a legacy collation, so it might create incompatibilities with operating systems and application platforms. It might also have other quirks I don't know of.
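A quick way to try this without changing the column definition is to apply the collation per query with a COLLATE clause. The sketch below uses a hypothetical table variable @Persons and column Name. It assumes varchar data: as far as I know, SQL_ collations apply their legacy comparison rules only to non-Unicode data and fall back to Unicode rules for nvarchar, so the result may differ for nvarchar columns.

```sql
-- Hypothetical sample data for illustration.
DECLARE @Persons TABLE (
    Name varchar(50) COLLATE Latin1_General_CI_AS
);

INSERT INTO @Persons (Name)
VALUES ('Fass'), ('Faß'), ('fass'), ('Mueller'), ('Müller');

-- Override the column collation for this comparison only.
-- The comparison stays case insensitive, and according to the answer
-- 'Faß' should no longer match a search for 'fass'.
SELECT Name
FROM @Persons
WHERE Name COLLATE SQL_Latin1_General_CP1_CI_AS = 'fass';

-- To change the column permanently instead (dbo.Persons is a hypothetical table):
-- ALTER TABLE dbo.Persons
--     ALTER COLUMN Name varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL;
```

A per-query COLLATE on the column generally keeps an index on that column from being used for a seek, so altering the column collation may be the better option once the behaviour is confirmed.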

A few years ago, I sketched out a kludgy workaround for a similar problem, and you can take a look at it here. (Click on the "Workarounds (1)" tab.) The problem in that case concerned uniqueness in a key column, not string comparison results, so applying it in your situation (comparing two columns to perform a single string comparison) might be infeasible.

Steve Kass answered Dec 24 '22 12:12
