I have written the following SQL CLR
function in order to hash string values larger then 8000 bytes (the limit of input value of the T-SQL
built-it HASHBYTES
function):
[SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true)]
public static SqlBinary HashBytes(SqlString algorithm, SqlString value)
{
HashAlgorithm algorithmType = HashAlgorithm.Create(algorithm.Value);
if (algorithmType == null || value.IsNull)
{
return new SqlBinary();
}
else
{
byte[] bytes = Encoding.UTF8.GetBytes(value.Value);
return new SqlBinary(algorithmType.ComputeHash(bytes));
}
}
It is working fine for Latin strings. For example, the following hashes are the same:
SELECT dbo.fn_Utils_GetHashBytes ('MD5', 'test'); -- 0x098F6BCD4621D373CADE4E832627B4F6
SELECT HASHBYTES ('MD5', 'test'); -- 0x098F6BCD4621D373CADE4E832627B4F6
The issue is it is not working with Cyrillic strings. For example:
SELECT dbo.fn_Utils_GetHashBytes ('MD5 ', N'даровете на влъхвите') -- NULL
SELECT HashBytes ('MD5 ',N'даровете на влъхвите') -- 0x838B1B625A6074B2BE55CDB7FCEA2832
SELECT dbo.fn_Utils_GetHashBytes ('SHA256', N'даровете на влъхвите') -- 0xA1D65374A0B954F8291E00BC3DD9DF655D8A4A6BF127CFB15BBE794D2A098844
SELECT HashBytes ('SHA2_256',N'даровете на влъхвите') -- 0x375F6993E0ECE1864336E565C8E14848F2A4BAFCF60BC0C8F5636101DD15B25A
I am getting NULL
for MD5
, although the code returns value if it is executed as console application. Could anyone tell what I am doing wrong?
Also, I've got the function from here and one of the comments says that:
Careful with CLR SP parameters being silently truncated to 8000 bytes - I had to tag the parameter with [SqlFacet(MaxSize = -1)] otherwise bytes after the 8000th would simply be ignored!
but I have tested this and it is working fine. For example, if I generate a hash of 8000 bytes string and a second hash of the same string plus one symbol, I get the hashes are different.
DECLARE @A VARCHAR(MAX) = '8000 bytes string...'
DECLARE @B VARCHAR(MAX) = @A + '1'
SELECT LEN(@A), LEN(@B)
SELECT IIF(dbo.fn_Utils_GetHashBytes ('MD5', @A + '1') = dbo.fn_Utils_GetHashBytes ('MD5', @B), 1, 0) -- 0
Should I worry about this?
To enable CLR integration we have to change the value of the configuration option "clr enabled" from 0 to 1 by using the sp_configure system stored procedure. In the next script, we are going to show the actual status of CLR integration and then enable it.
To determine if CLR is enabled, execute the following commands: EXEC SP_CONFIGURE 'show advanced options', '1'; RECONFIGURE WITH OVERRIDE; EXEC SP_CONFIGURE 'clr enabled';
CLR functions can be used to access external resources such as files, network resources, Web Services, other databases (including remote instances of SQL Server). This can be achieved by using various classes in the . NET Framework, such as System.IO , System. WebServices , System. Sql , and so on.
NET Framework common language runtime (CLR), instead of in Transact-SQL. An assembly in SQL Server is an object that references a managed application module (. dll file) that was created in the . NET Framework common language runtime. An assembly contains class metadata and managed code.
Encoding.UTF8.GetBytes(...)
SQL Server has no concept of UTF-8. Use UCS-2 (UTF-16) or ASCII. The encoding used must match what you'd pass to HASHBYTES
. You can easily see that HASHBYTES
will hash differently VARCHAR
vs. NVARCHAR
:
select HASHBYTES('MD5', 'Foo') -- 0x1356C67D7AD1638D816BFB822DD2C25D
select HASHBYTES('MD5', N'Foo') -- 0xB25FF0AD90D09D395090E8A29FF4C63C
Best would be to change the SQLCLR function to accept the bytes, not a string, and deal with the cast to VARBINARY
in the caller.
SELECT dbo.fn_Utils_GetHashBytes ('MD5', CAST(N'даровете на влъхвите' AS VARBINARY(MAX));
FYI SQL Server 2016 has lifted the 8000 bytes restriction on HASHBYTES
:
For SQL Server 2014 and earlier, allowed input values are limited to 8000 bytes.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With