Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Server 2008 and HashBytes

I have quite a large nvarchar which I wish to pass to the HashBytes function. I get the error:

"String or binary would be truncated. Cannot insert the value NULL into column 'colname', tbale 'table'; column does not allow nulls. UPDATE fails. The statement has been terminated."

Being ever resourceful, I discovered this was due to the HashBytes function having a maximum limit of 8000 bytes. Further searching showed me a 'solution' where my large varchar would be divided and hashed seperately and then later combined with this user defined function:

function [dbo].[udfLargeHashTable] (@algorithm nvarchar(4), @InputDataString varchar(MAX))
RETURNS varbinary(MAX)
AS
BEGIN
DECLARE
    @Index int,
    @InputDataLength int,
    @ReturnSum varbinary(max),
    @InputData varbinary(max)

SET @ReturnSum = 0
SET @Index = 1
SET @InputData = convert(binary,@InputDataString)
SET @InputDataLength = DATALENGTH(@InputData)

WHILE @Index <= @InputDataLength
BEGIN
    SET @ReturnSum = @ReturnSum + HASHBYTES(@algorithm, SUBSTRING(@InputData, @Index, 8000))
    SET @Index = @Index + 8000
END
RETURN @ReturnSum
END

which I call with:

set @ReportDefinitionHash=convert(int,dbo.[udfLargeHashTable]('SHA1',@ReportDefinitionForLookup))

Where @ReportDefinitionHash is int, and @ReportDefinitionForLookup is the varchar

Passing a simple char like 'test' produces a different int with my UDF than a normal call to HashBytes would produce.

Any advice on this issue?

like image 532
Bob Avatar asked Sep 15 '10 13:09

Bob


People also ask

What is SQL Server Hashbytes?

Specifies an expression that evaluates to a character or binary string to be hashed. The output conforms to the algorithm standard: 128 bits (16 bytes) for MD2, MD4, and MD5; 160 bits (20 bytes) for SHA and SHA1; 256 bits (32 bytes) for SHA2_256, and 512 bits (64 bytes) for SHA2_512.

Does SQL use hashing?

A hash is a number that is generated by reading the contents of a document or message. Different messages should generate different hash values, but the same message causes the algorithm to generate the same hash value. SQL Server has a built-in function called HashBytes to support data hashing.

How decrypt sha256 SQL Server?

Answers. It is not possible to decrypt a hash. This is because hashing does not encrypt the original value at all. Hashing instead applies a one-way mathematical algorithm to the original value, resulting in a binary value.

What is hash join in SQL Server?

The hash join first scans or computes the entire build input and then builds a hash table in memory. Each row is inserted into a hash bucket depending on the hash value computed for the hash key. If the entire build input is smaller than the available memory, all rows can be inserted into the hash table.


2 Answers

If you can't create a function and have to use something that already exists in the DB:

sys.fn_repl_hash_binary

can be made to work using the syntax:

sys.fn_repl_hash_binary(cast('some really long string' as varbinary(max)))

Taken from: http://www.sqlnotes.info/2012/01/16/generate-md5-value-from-big-data/

like image 67
Nick Avatar answered Oct 23 '22 12:10

Nick


Just use this function (taken from Hashing large data strings with a User Defined Function):

create function dbo.fn_hashbytesMAX
    ( @string  nvarchar(max)
    , @Algo    varchar(10)
    )
    returns varbinary(20)
as
/************************************************************
*
*    Author:        Brandon Galderisi
*    Last modified: 15-SEP-2009 (by Denis)
*    Purpose:       uses the system function hashbytes as well
*                   as sys.fn_varbintohexstr to split an 
*                   nvarchar(max) string and hash in 8000 byte 
*                   chunks hashing each 8000 byte chunk,,
*                   getting the 40 byte output, streaming each 
*                   40 byte output into a string then hashing 
*                   that string.
*
*************************************************************/
begin
     declare    @concat       nvarchar(max)
               ,@NumHash      int
               ,@HASH         varbinary(20)
     set @NumHash = ceiling((datalength(@string)/2)/(4000.0))
    /* HashBytes only supports 8000 bytes so split the string if it is larger */
    if @NumHash>1
    begin
                                                        -- # * 4000 character strings
          ;with a as (select 1 as n union all select 1) -- 2 
               ,b as (select 1 as n from a ,a a1)       -- 4
               ,c as (select 1 as n from b ,b b1)       -- 16
               ,d as (select 1 as n from c ,c c1)       -- 256
               ,e as (select 1 as n from d ,d d1)       -- 65,536
               ,f as (select 1 as n from e ,e e1)       -- 4,294,967,296 = 17+ TRILLION characters
               ,factored as (select row_number() over (order by n) rn from f)
               ,factors as (select rn,(rn*4000)+1 factor from factored)

          select @concat = cast((
          select right(sys.fn_varbintohexstr
                         (
                         hashbytes(@Algo, substring(@string, factor - 4000, 4000))
                         )
                      , 40) + ''
          from Factors
          where rn <= @NumHash
          for xml path('')
          ) as nvarchar(max))


          set @HASH = dbo.fn_hashbytesMAX(@concat ,@Algo)
    end
     else
     begin
          set @HASH = convert(varbinary(20), hashbytes(@Algo, @string))
     end

return @HASH
end

And the results are as following:

select 
 hashbytes('sha1', N'test') --native function with nvarchar input
,hashbytes('sha1', 'test') --native function with varchar input 
,dbo.fn_hashbytesMAX('test', 'sha1') --Galderisi's function which casts to nvarchar input
,dbo.fnGetHash('sha1', 'test') --your function

Output:

0x87F8ED9157125FFC4DA9E06A7B8011AD80A53FE1  
0xA94A8FE5CCB19BA61C4C0873D391E987982FBBD3  
0x87F8ED9157125FFC4DA9E06A7B8011AD80A53FE1   
0x00000000AE6DBA4E0F767D06A97038B0C24ED720662ED9F1
like image 20
Denis Valeev Avatar answered Oct 23 '22 11:10

Denis Valeev