
Hashing more than 8000 bytes in SQL Server

SQL Server's hashing function HASHBYTES has an input limit of 8000 bytes.

How do you hash larger strings?

SDReyes asked Oct 14 '11 15:10


3 Answers

You could write a SQL CLR function:

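using System.Data.SqlTypes;
using System.Security.Cryptography;
using System.Text;

// Place this method inside a public static class in your CLR assembly.
[Microsoft.SqlServer.Server.SqlFunction]
public static SqlBinary BigHashBytes(SqlString algorithm, SqlString data)
{
    // Return NULL for NULL input, as HASHBYTES does.
    if (algorithm.IsNull || data.IsNull) return SqlBinary.Null;

    using (var algo = HashAlgorithm.Create(algorithm.Value))
    {
        // UTF-8 bytes match HASHBYTES on varchar input only for ASCII text.
        var bytes = Encoding.UTF8.GetBytes(data.Value);
        return new SqlBinary(algo.ComputeHash(bytes));
    }
}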

And then it can be called in SQL like this:

--these return the same value
select HASHBYTES('md5', 'test stuff')
select dbo.BigHashBytes('md5', 'test stuff')

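BigHashBytes is only necessary when the input is longer than 8000 bytes; for shorter input, the built-in HASHBYTES is enough. Note that the two agree only when they hash the same bytes: the CLR function hashes the UTF-8 encoding of the string, so its result can differ from HASHBYTES for non-ASCII varchar data or for nvarchar input.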
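
To call the function from SQL, the compiled assembly has to be registered first. A minimal sketch of that step, assuming the DLL sits at C:\clr\BigHash.dll and the method lives in a class named HashFunctions (the path, the assembly name, and the class name are all hypothetical, not part of the answer above):

```sql
-- Assumptions: the assembly path, the assembly name BigHash, and the class
-- name HashFunctions are hypothetical; substitute your own.

-- Enable CLR integration (it is off by default).
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;
GO

-- Register the compiled .NET assembly with the database.
CREATE ASSEMBLY BigHash
FROM 'C:\clr\BigHash.dll';
GO

-- Declare @data as nvarchar(max) so inputs over 8000 bytes are accepted.
CREATE FUNCTION dbo.BigHashBytes (@algorithm nvarchar(50), @data nvarchar(max))
RETURNS varbinary(8000)
AS EXTERNAL NAME BigHash.HashFunctions.BigHashBytes;
GO
```

Depending on the SQL Server version and its security settings, loading an assembly may need additional configuration.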

Paul Tyng answered Oct 18 '22 11:10


You could hash the input in 8000-byte (or 4000- or 2000-byte) chunks and then either concatenate the chunk hashes or hash them again into a single value. The result will not equal a standard hash of the whole input, so this gets difficult if hashes created outside SQL Server (in an external .NET app, for example) must be compared with it: the external code has to reproduce the same chunking scheme exactly.
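
The chunking idea can be sketched in T-SQL roughly as follows. This is one possible variant, not a standard algorithm: instead of collecting all chunk hashes and hashing them at the end, it chains them (each step hashes the previous digest plus the next chunk), which keeps every HASHBYTES call under 8000 bytes no matter how long the input is. The function name dbo.ChunkedHashBytes, the 7900-byte chunk size, and the choice of MD5 are assumptions made for illustration.

```sql
-- Hypothetical helper; dbo.ChunkedHashBytes is not a built-in function.
CREATE FUNCTION dbo.ChunkedHashBytes (@data varbinary(max))
RETURNS varbinary(16)                       -- MD5 digests are 16 bytes
AS
BEGIN
    IF @data IS NULL RETURN NULL;

    DECLARE @chunkSize int = 7900;          -- leaves room for the 16-byte running digest
    DECLARE @offset bigint = 1;             -- SUBSTRING is 1-based
    DECLARE @length bigint = DATALENGTH(@data);
    DECLARE @digest varbinary(16) = HASHBYTES('MD5', 0x);  -- seed: hash of empty input

    WHILE @offset <= @length
    BEGIN
        -- Hash the previous digest plus the next chunk: at most 7916 bytes per call.
        SET @digest = HASHBYTES('MD5',
            @digest + CAST(SUBSTRING(@data, @offset, @chunkSize) AS varbinary(7900)));
        SET @offset = @offset + @chunkSize;
    END;

    RETURN @digest;
END;
GO

-- Usage: hash a 20000-byte value, which is too long for a single HASHBYTES call.
SELECT dbo.ChunkedHashBytes(
    CAST(REPLICATE(CAST('a' AS varchar(max)), 20000) AS varbinary(max)));
```

The value returned is not the MD5 of the whole input, so anything that compares against it has to apply the same seed, chunk size, and chaining order.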

Another option: Lean on SQL Server's CLR integration and perform the hashing in a .NET assembly.

Paul Sasik answered Oct 18 '22 13:10


Building on Paul's chunking idea: you could store the hashed string in an XML column, with each chunk's hash as a separate XML element.

Shan Plourde answered Oct 18 '22 11:10