
Generate a hash for a set of rows in SQL Server

Is there any way in SQL Server 2012 to generate a hash of a set of rows and columns?

I want to generate a hash and store it on the parent record. Then, when an update comes in, I'll compare the incoming hash with the parent record's hash and I'll know whether the data has changed.

So something like this would be nice:

SELECT GENERATEHASH(CONCATENATE(Name, Description, AnotherColumn))
FROM MyChildTable WHERE ParentId = 2 -- subset of data belonging to parent record 2

"CONCATENATE" would be an aggregate function that concatenates not only the columns but also the rows in the result set. Like MAX, but returning everything as one concatenated string.

Hopefully this helps you see what I mean anyway!

The fundamental problem I'm trying to solve is that my client's system performs imports of vast amounts of hierarchical data. If I can skip processing unchanged data by comparing hashes, I expect this will save a lot of time. At the moment, the SP is running 300% slower when it has to process duplicate data.

Many thanks

krisdyson asked Aug 08 '12 10:08


3 Answers

SELECT HASHBYTES('md5', CONVERT(varbinary(max),
    (SELECT * FROM MyChildTable WHERE ParentId = 2 FOR XML AUTO)))

However, HashBytes accepts at most 8000 bytes of input, so for larger data you can write a function that computes the MD5 of each 8000-byte chunk.
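
As a rough sketch of that chunking idea, the function below chains MD5 over 7984-byte slices, so each HASHBYTES call receives at most 8000 bytes (the previous 16-byte digest plus one slice). The name dbo.ChainedMd5 is hypothetical. Note that the result is a chained digest, not the standard MD5 of the whole input (unless the input fits in a single slice), so both sides of a comparison would have to compute it the same way.

```sql
-- Hypothetical helper (not a built-in): chained MD5 over input of any length.
CREATE FUNCTION dbo.ChainedMd5 (@data varbinary(max))
RETURNS varbinary(16)
AS
BEGIN
    DECLARE @hash varbinary(16) = 0x;          -- running digest, empty before the first slice
    DECLARE @pos  bigint = 1;                  -- 1-based offset of the next slice
    DECLARE @len  bigint = DATALENGTH(@data);

    IF @data IS NULL RETURN NULL;

    -- "OR @pos = 1" makes the loop run once even for empty input.
    WHILE @pos <= @len OR @pos = 1
    BEGIN
        -- 16 bytes of previous digest + up to 7984 bytes of data <= 8000 bytes
        SET @hash = HASHBYTES('MD5',
            @hash + CAST(SUBSTRING(@data, @pos, 7984) AS varbinary(7984)));
        SET @pos = @pos + 7984;
    END;

    RETURN @hash;
END;
```

It could then wrap the FOR XML query. The explicit column list and ORDER BY Name are assumptions for illustration; row order changes the XML and therefore the hash, so ordering by a unique key would be safer.

```sql
SELECT dbo.ChainedMd5(CONVERT(varbinary(max),
    (SELECT Name, Description, AnotherColumn
     FROM MyChildTable
     WHERE ParentId = 2
     ORDER BY Name
     FOR XML AUTO)));
```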

jrionegro answered Oct 13 '22 19:10


You can use the CHECKSUM_AGG aggregate. It is made for that purpose.
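
For the table in the question, a minimal sketch might look like this. The alias ChildChecksum is made up for illustration. CHECKSUM produces one int per row and CHECKSUM_AGG combines those ints into a single value for the set.

```sql
-- One int checksum per row, combined into one int for all rows of parent 2.
SELECT CHECKSUM_AGG(CHECKSUM(Name, Description, AnotherColumn)) AS ChildChecksum
FROM MyChildTable
WHERE ParentId = 2;
```

The result does not depend on row order, so no ORDER BY is needed. Keep in mind that checksums can collide: a different value proves the data changed, but an equal value does not strictly guarantee it is identical.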

like image 26
usr Avatar answered Oct 13 '22 19:10

usr


For single-row hashes:

select HASHBYTES('md5', Name + Description + AnotherColumn)
FROM MyChildTable WHERE ParentId = 2

For a table checksum:

select sum(checksum(Name + Description + AnotherColumn)*1.0)
FROM MyChildTable WHERE ParentId = 2
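
One thing to watch with Name + Description + AnotherColumn: under the default CONCAT_NULL_YIELDS_NULL setting, a NULL in any column makes the whole expression NULL, and 'ab' + 'c' concatenates to the same string as 'a' + 'bc'. A possible variant, assuming the three columns are character types and that '|' never occurs in the data:

```sql
-- Assumptions: character columns; '|' is an arbitrary separator not present in the data.
-- ISNULL maps NULL to '', so NULL and empty string hash the same here.
SELECT HASHBYTES('MD5',
           ISNULL(Name, '') + '|' +
           ISNULL(Description, '') + '|' +
           ISNULL(AnotherColumn, '')) AS RowHash
FROM MyChildTable
WHERE ParentId = 2;
```

The same expression could be used inside checksum(...) for the table-level version.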

juergen d answered Oct 13 '22 20:10