I have a requirement wherein I have to create hashvalue which consist of all columns of a table. With Checksum
this can be done easily, but Checksum
is not recommended as per Microsoft:
If at least one of the values in the expression list changes, the list checksum will probably change. However, this is not guaranteed. Therefore, to detect whether values have changed, we recommend the use of CHECKSUM only if your application can tolerate an occasional missed change. Otherwise, consider using HashBytes instead. With a specified MD5 hash algorithm, the probability that HashBytes will return the same result, for two different inputs, is much lower compared to CHECKSUM.
HASHBYTES
accepts only 2 parameters (algorithm type, column)
Now the problem is even though HASHBYTES
is more reliable compared to checksum but there doesn't seem to be an easy way to create it on multiple columns.
An example in the checksum,
create table dbo.chksum_demo1
(
id int not null,
name varchar(25),
address varchar(250),
HashValue as Checksum (id,name,address)
CONSTRAINT PK_chksum_demo1 PRIMARY KEY (Id)
)
How can we do the above using Hashbytes
instead of checksum?
SQL Server has a built-in function called HashBytes to support data hashing. A good hashing algorithm has these properties: It is especially sensitive to small changes in the input. Minor changes to the document will generate a very different hash result.
In SQL, selecting any column is one of the easiest things as you have to type only the SELECT command and after that, the column name and the output will be the desired column. Syntax: SELECT (Column Name) FROM (Table Name);
You can use MD2, MD4, MD5, SHA, or SHA1 to create hashes of your data. These algorithms are limited up to 20 bytes only. In SQL Server 2012, we have an enhancement in this function and now it supports SHA2_256, SHA2_512 algorithms that can generate 32 and 64 bytes hash codes for the respective input.
The SQL hashbytes function can use a number of different algorithms, but none of the are reversible. They are all hashing algorithms, not encryption algorithms.
Copying the best suggestion from the supplied link here, and adding a where
to show it can be used:
select MBT.refID,
hashbytes(
'MD5',
(select MBT.* from (values(null))foo(bar) for xml auto)
) as [Hash]
from MyBaseTable as MBT
where MBT.SomeFilter='X'
https://www.sqlservercentral.com/forums/topic/suggestionsolution-for-using-hashbytes-across-entire-table
One method is concat()
the fields together along with a delimiter. For any dates format them to strings manually to control the formatting.
HashValue as HASHBYTES('SHA2_256', CONCAT(ID,'|',name,'|',address))
The delimiter is needed to handle empty fields so ID:1 Name:'' Address:'12' is different from ID:1 Name:'12' Address:''.
Use this:
SELECT *,
HASHBYTES('MD5', (SELECT ID, name, address FOR XML RAW))
FROM Table1
SELECT HASHBYTES('<algorithm>', CONCAT_WS('|', f1, f2, f3, f4 ...))
FROM Table1
algorithm>::= MD2 | MD4 | MD5 | SHA | SHA1 | SHA2_256 | SHA2_512
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With