Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sorting a column against different collations in SQL Server

I have a Companies table with a multilingual CompanyName column (defined as nvarchar(512)).

I also have a stored procedure to search through and return a list of companies, it takes 2 nvarchar parameters - one is the search term, and the other is an ISO language code.

What I would like to be able to do is return the results of the search sorted with a collation appropriate to the language code supplied in the second parameter, for example:

SELECT * FROM dbo.Companies WHERE CompanyName LIKE '%searchstring%'
ORDER BY
    CASE @lang
        WHEN 'sv' THEN CompanyName COLLATE Sami_Sweden_Finland_100_CI_AI
        WHEN 'ch' THEN CompanyName COLLATE Chinese_Simplified_Pinyin_100_CI_AI
                ELSE CompanyName
    END

however I get the following error when I try to run it:

Cannot resolve the collation conflict between "Chinese_Simplified_Pinyin_100_CI_AI" and "Sami_Sweden_Finland_100_CI_AI" in the CASE operation.

This makes no sense to me - it's not like I'm sorting across collations, why would there be a collation conflict? It's a mutually exclusive choice.

I've tried not to be too clever and just use dynamic sql, unfortunately that appears to be leaving the database unable to cache an execution plan, hence query times are taking upwards of 20 seconds (as opposed to 1) when the table contains around 2 million rows.

I'm sure that culture sensitive sorting must be a common problem, does anyone know of a good solution that doesn't involve altering the current schema (i.e. like having to create additional columns)?

like image 701
Jonathan Hoult Avatar asked Oct 13 '22 22:10

Jonathan Hoult


2 Answers

Acolumn can only have one collation. The CASE measn you're tryingt to have multiple that conflicts with each other. Try this:

SELECT * FROM dbo.Companies WHERE CompanyName LIKE '%searchstring%'
ORDER BY
    CASE @lang
        WHEN 'sv' THEN CompanyName ELSE '' END 
    END COLLATE Sami_Sweden_Finland_100_CI_AI,
    CASE @lang
        WHEN 'ch' THEN CompanyName ELSE '' END 
    END Chinese_Simplified_Pinyin_100_CI_AI
like image 98
gbn Avatar answered Nov 03 '22 23:11

gbn


Okay I've finally worked out the answer to this, thanks to @gbn, your solution was definitely correct given my initial definition of the question.

However it turns out that I was working under the mistaken assumption that Dynamic SQL should be avoided. To get this to work I had to add a computed (and indexed) column with a value of

CompanyName COLLATE [Collation_Name]

for every collation I needed to sort on (I realise I said in my original question that I didn't want to alter the schema but what I really meant was that I didn't want to move existing data).

When running the following query

EXEC('SELECT TOP 10 * FROM dbo.Companies
      WHERE CompanyName LIKE ''%searchstring%''
      ORDER BY CompanyName COLLATE ' + @collation)

the query engine picks up the index on the associated computed column and I get a quick response. Unfortunately when using the query provided by @gbn, the indexes created on the computed columns are ignored, with a consequent performance hit.

This is all very surprising to me as I've always believed that Dynamic SQL is an evil that should be avoided unless absolutely necessary (i.e when your query varies in a way that you can't parameterise)

like image 44
Jonathan Hoult Avatar answered Nov 03 '22 23:11

Jonathan Hoult