TSQL CHECKSUM conundrum

Question

SELECT BINARY_CHECKSUM('Clifton House, Thornaby Place, Teesdale South, Stockton-On-Tees, Cleveland, TS17 6SD')
SELECT BINARY_CHECKSUM('Clifton House, Teesdale South, Thornaby Place, Stockton-On-Tees, Cleveland, TS17 6SD')

SELECT BINARY_CHECKSUM('Glenfield Hospital, Groby Road, , Leicester, Leicestershire, LE3 9DZ')
SELECT BINARY_CHECKSUM('Glenfield Hospital, Groby Road, , Leicester, Leicestershire, LE3 9EJ')

Have a look at the above. The 2 pairs of addresses will generate the same checksum value even though there are differences in the text. It is my understanding that, whilst you cannot guarantee that CHECKSUM and BINARY_CHECKSUM will be different for any random content that they should be good for determining relatively small changes in a given row.

Interestingly these pairs of values demonstrate precisely the opposite. They are generating equal checksum values for very similar data values. These are in fact the only duplicate checksum values in a largish (680,000 record) table of addresses.

I am a little concerned that I have misunderstood the value of checksum in generating UPDATEs? Do I have to resort to a brute force field by field comparison to be absolutely certain of picking up a change in a row of data?

The original data for these examples was in 6 separate columns. I have reduced the code sample to a minimal state for clarity.

Sheena · Accepted Answer

Checksums are never completely foolproof. For a totally-always-correct-no-matter-what-solution brute force is the way. There are some less intense methods that can work though.

A checksum is pretty much a really simple hash. Try using hashbytes instead. then you can make use of a more chaotic algorithm like md5.

eg:

SELECT HASHBYTES('MD5', 'Glenfield Hospital, Groby Road, , Leicester, Leicestershire, LE3 9DZ')

Z.D. · Answer

try using HASHBYTES() with MD5 or SHA1 instead of BINARY_CHECKSUM. read the help for HASHBYTES() first...

TSQL CHECKSUM conundrum

Tags:

sql-server

tsql

Andrew Wiles

2 Answers

Sheena

Z.D.

Recent Activity

Donate For Us

TSQL CHECKSUM conundrum

Tags:

sql-server

tsql

Andrew Wiles

2 Answers

Sheena

Z.D.

Related questions

Recent Activity

Donate For Us