Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL CLR function based on .net ComputeHash is not working with Cyrrilic

I have written the following SQL CLR function in order to hash string values larger then 8000 bytes (the limit of input value of the T-SQL built-it HASHBYTES function):

[SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true)]
public static SqlBinary HashBytes(SqlString algorithm, SqlString value)
{
    HashAlgorithm algorithmType = HashAlgorithm.Create(algorithm.Value);

    if (algorithmType == null || value.IsNull)
    {
        return new SqlBinary();
    }
    else
    {
        byte[] bytes = Encoding.UTF8.GetBytes(value.Value);
        return new SqlBinary(algorithmType.ComputeHash(bytes));
    }
}

It is working fine for Latin strings. For example, the following hashes are the same:

SELECT dbo.fn_Utils_GetHashBytes ('MD5', 'test'); -- 0x098F6BCD4621D373CADE4E832627B4F6
SELECT HASHBYTES ('MD5', 'test');                 -- 0x098F6BCD4621D373CADE4E832627B4F6

The issue is it is not working with Cyrillic strings. For example:

SELECT dbo.fn_Utils_GetHashBytes ('MD5 ', N'даровете на влъхвите') -- NULL
SELECT HashBytes ('MD5 ',N'даровете на влъхвите') -- 0x838B1B625A6074B2BE55CDB7FCEA2832

SELECT dbo.fn_Utils_GetHashBytes ('SHA256', N'даровете на влъхвите') -- 0xA1D65374A0B954F8291E00BC3DD9DF655D8A4A6BF127CFB15BBE794D2A098844
SELECT HashBytes ('SHA2_256',N'даровете на влъхвите') -- 0x375F6993E0ECE1864336E565C8E14848F2A4BAFCF60BC0C8F5636101DD15B25A 

I am getting NULL for MD5, although the code returns value if it is executed as console application. Could anyone tell what I am doing wrong?


Also, I've got the function from here and one of the comments says that:

Careful with CLR SP parameters being silently truncated to 8000 bytes - I had to tag the parameter with [SqlFacet(MaxSize = -1)] otherwise bytes after the 8000th would simply be ignored!

but I have tested this and it is working fine. For example, if I generate a hash of 8000 bytes string and a second hash of the same string plus one symbol, I get the hashes are different.

DECLARE @A VARCHAR(MAX) = '8000 bytes string...'
DECLARE @B VARCHAR(MAX) = @A + '1'
SELECT LEN(@A), LEN(@B)

SELECT IIF(dbo.fn_Utils_GetHashBytes ('MD5', @A + '1') = dbo.fn_Utils_GetHashBytes ('MD5', @B), 1, 0) -- 0

Should I worry about this?

like image 344
gotqn Avatar asked Feb 08 '16 10:02

gotqn


People also ask

How do I set CLR enabled configuration?

To enable CLR integration we have to change the value of the configuration option "clr enabled" from 0 to 1 by using the sp_configure system stored procedure. In the next script, we are going to show the actual status of CLR integration and then enable it.

How can I tell if SQL Server is CLR enabled?

To determine if CLR is enabled, execute the following commands: EXEC SP_CONFIGURE 'show advanced options', '1'; RECONFIGURE WITH OVERRIDE; EXEC SP_CONFIGURE 'clr enabled';

What is a CLR function in SQL?

CLR functions can be used to access external resources such as files, network resources, Web Services, other databases (including remote instances of SQL Server). This can be achieved by using various classes in the . NET Framework, such as System.IO , System. WebServices , System. Sql , and so on.

What is CLR assembly in SQL Server?

NET Framework common language runtime (CLR), instead of in Transact-SQL. An assembly in SQL Server is an object that references a managed application module (. dll file) that was created in the . NET Framework common language runtime. An assembly contains class metadata and managed code.


1 Answers

 Encoding.UTF8.GetBytes(...)

SQL Server has no concept of UTF-8. Use UCS-2 (UTF-16) or ASCII. The encoding used must match what you'd pass to HASHBYTES. You can easily see that HASHBYTES will hash differently VARCHAR vs. NVARCHAR:

select HASHBYTES('MD5', 'Foo')  -- 0x1356C67D7AD1638D816BFB822DD2C25D
select HASHBYTES('MD5', N'Foo') -- 0xB25FF0AD90D09D395090E8A29FF4C63C

Best would be to change the SQLCLR function to accept the bytes, not a string, and deal with the cast to VARBINARY in the caller.

 SELECT dbo.fn_Utils_GetHashBytes ('MD5', CAST(N'даровете на влъхвите' AS VARBINARY(MAX));

FYI SQL Server 2016 has lifted the 8000 bytes restriction on HASHBYTES:

For SQL Server 2014 and earlier, allowed input values are limited to 8000 bytes.

like image 121
Remus Rusanu Avatar answered Sep 22 '22 06:09

Remus Rusanu