Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

COLLATE in UDF does not work as expected

I have a table with text field. I want to select rows where text is in all caps. This code works as it should, and returns ABC:

SELECT txt
FROM (SELECT 'ABC' AS txt UNION SELECT 'cdf') t
WHERE 
txt COLLATE SQL_Latin1_General_CP1_CS_AS = UPPER(txt)

then I create UDF (as suggested here):

CREATE FUNCTION [dbo].[fnsConvert]
(
      @p NVARCHAR(2000) ,
      @c NVARCHAR(2000)
)
RETURNS NVARCHAR(2000)
AS
    BEGIN
        IF ( @c = 'SQL_Latin1_General_CP1_CS_AS' )
            SET @p = @p COLLATE SQL_Latin1_General_CP1_CS_AS
        RETURN @p    
    END

and run it as follows (which looks like an equivalent code to me):

SELECT txt
FROM (SELECT 'ABC' AS txt UNION SELECT 'cdf') t
WHERE 
dbo.fnsConvert(txt, 'SQL_Latin1_General_CP1_CS_AS') = UPPER(txt)

however, this returns ABC as well as cdf.

Why is that so, and how do I get this to work?

PS I need UDF here to be able to call case-sensitive comparison from .Net LINQ2SQL provider.

like image 737
avs099 Avatar asked Dec 19 '17 15:12

avs099


People also ask

What is Snowflake collation?

Collation allows you to specify alternative rules for comparing strings, which can be used to compare and sort data according to a particular language or other user-specified rules.

What is collate SQL_Latin1_General_CP1_CI_AS in select statement?

The collate clause is used for case sensitive and case insensitive searches in the columns of the SQL server. There are two types of collate clause present: SQL_Latin1_General_CP1_CS_AS for case sensitive. SQL_Latin1_General_CP1_CI_AS for case insensitive.

How do I change collation function in SQL Server?

You can change the collation of any new objects that are created in a user database by using the COLLATE clause of the ALTER DATABASE statement. This statement does not change the collation of the columns in any existing user-defined tables. These can be changed by using the COLLATE clause of ALTER TABLE.


Video Answer


2 Answers

A variable cannot have it's own collation. It will always use the server's default. Check this:

--I declare three variables, each of which get's its own collation - at least one might think so:

DECLARE @deflt VARCHAR(100) = 'aBc'; --Latin1_General_CI_AS in my system
DECLARE @Arab VARCHAR(100) = 'aBc' COLLATE Arabic_100_CS_AS_WS_SC;
DECLARE @Rom VARCHAR(100) = 'aBc' COLLATE Romanian_CI_AI

--Now check this. All three variables are seen as the system's default collation:

SELECT [name], system_type_name, collation_name
FROM sys.dm_exec_describe_first_result_set(N'SELECT @deflt AS Deflt, @Arab AS Arab, @Rom AS Rom'
                                          ,N'@deflt varchar(100), @Arab varchar(100),@Rom varchar(100)'
                                          ,0);

/*
name    system_type_name    collation_name
Deflt   varchar(100)        Latin1_General_CI_AS
Arab    varchar(100)        Latin1_General_CI_AS
Rom     varchar(100)        Latin1_General_CI_AS
*/

--Now we check a simple comparison of "aBc" against "ABC"

SELECT CASE WHEN @deflt = 'ABC' THEN 'CI' ELSE 'CS' END AS CheckDefault
      ,CASE WHEN @Arab = 'ABC' THEN 'CI' ELSE 'CS' END AS CheckArab
      ,CASE WHEN @Rom = 'ABC' THEN 'CI' ELSE 'CS' END AS CheckRom

/*CI    CI  CI*/

--But we can specify the collation for one given action!

SELECT CASE WHEN @deflt = 'ABC' THEN 'CI' ELSE 'CS' END AS CheckDefault
      ,CASE WHEN @Arab = 'ABC' COLLATE Arabic_100_CS_AS_WS_SC THEN 'CI' ELSE 'CS' END AS CheckArab
      ,CASE WHEN @Rom = 'ABC' COLLATE Romanian_CI_AI THEN 'CI' ELSE 'CS' END AS CheckRom

/*CI    CS  CI*/

--But a table's column will behave differently:

CREATE TABLE #tempTable(deflt VARCHAR(100)
                       ,Arab VARCHAR(100) COLLATE Arabic_100_CS_AS_WS_SC
                       ,Rom VARCHAR(100) COLLATE Romanian_CI_AI);

INSERT INTO #tempTable(deflt,Arab,Rom) VALUES('aBc','aBc','aBc');

SELECT [name], system_type_name, collation_name
FROM sys.dm_exec_describe_first_result_set(N'SELECT * FROM #tempTable',NULL,0);
DROP TABLE #tempTable;

/*
name    system_type_name    collation_name
deflt   varchar(100)        Latin1_General_CI_AS
Arab    varchar(100)        Arabic_100_CS_AS_WS_SC
Rom     varchar(100)        Romanian_CI_AI
*/

--This applys for declared table variables also. The comparison "knows" the specified collation:

DECLARE @TableVariable TABLE(deflt VARCHAR(100)
                            ,Arab VARCHAR(100) COLLATE Arabic_100_CS_AS_WS_SC
                            ,Rom VARCHAR(100) COLLATE Romanian_CI_AI);

INSERT INTO @TableVariable(deflt,Arab,Rom) VALUES('aBc','aBc','aBc');

SELECT CASE WHEN tv.deflt = 'ABC' THEN 'CI' ELSE 'CS' END AS CheckDefault
      ,CASE WHEN tv.Arab = 'ABC' THEN 'CI' ELSE 'CS' END AS CheckArab
      ,CASE WHEN tv.Rom = 'ABC' THEN 'CI' ELSE 'CS' END AS CheckRom
FROM @TableVariable AS tv

/*CI    CS  CI*/

UPDATE Some documentation

At this link You can read about the details. A collation does not change the value. It applys a rule (related to NOT NULL which does not change the values, but just adds the rule whether NULL can be set or not).

The documentation tells clearly

Is a clause that can be applied to a database definition or a column definition to define the collation, or to a character string expression to apply a collation cast.

And a bit later you'll find

  1. Creating or altering a database
  2. Creating or altering a table column
  3. Casting the collation of an expression

UPDATE 2: A suggestion for a solution

If you want to have control whether a comparison is done CS or CI you might try this:

DECLARE @tbl TABLE(SomeValueInDefaultCollation VARCHAR(100));
INSERT INTO  @tbl VALUES ('ABC'),('aBc');

DECLARE @CompareCaseSensitive BIT = 0;
DECLARE @SearchFor VARCHAR(100) = 'aBc';

SELECT *
FROM @tbl 
WHERE (@CompareCaseSensitive=1 AND SomeValueInDefaultCollation=@SearchFor COLLATE Latin1_General_CS_AS)
   OR (ISNULL(@CompareCaseSensitive,0)=0 AND SomeValueInDefaultCollation=@SearchFor COLLATE Latin1_General_CI_AS);

With @CompareCaseSensitive set to 1 it will return just the aBc, with NULL or 0 it will return both lines.

This is - for sure! - much better in performance than an UDF.

like image 100
Shnugo Avatar answered Oct 31 '22 03:10

Shnugo


Please try using BINARY_CHECKSUM Function, and no need to UDF Function:

SELECT txt
FROM (SELECT 'ABC' AS txt UNION SELECT 'cdf') t
WHERE 
BINARY_CHECKSUM(txt)= BINARY_CHECKSUM(UPPER(txt))
like image 31
mhd swe Avatar answered Oct 31 '22 01:10

mhd swe