Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why generated MD5 hash in sql server are not equal? [duplicate]

I have a table in SQL Server 2008 R2 that contain two field (WordHash, Word). This Hash field generated in C# and I need regenerate hash code for Word field in sql server.

But my problem is that generated MD5 hash in sql server and C# are different. I found below code to resolve this problem but still I have same problem.

SQL code:

CONVERT(NVARCHAR(32),HASHBYTES('MD5', 'some word'), 2)

After putting this code block to my query, I saw some wired result! This is my result:
My Query:

SELECT 
    [WordHash],
    convert(nvarchar(32),HASHBYTES('MD5', 'Analytics'),2) AS TestHash,
    convert(nvarchar(32),HASHBYTES('MD5', [Word]),2) AS SqlHash
FROM myTable

Result:

WordHash: A768CAA988605A2846599CF7E2D0C26A
TestHash: A768CAA988605A2846599CF7E2D0C26A
SqlHash F4AFA5FEF805F7F5163EC6402BAF61FF

Note that the 'Analytics' is one of records data in database.
Why TestHash & SqlHash are different while they generated from same code!?

like image 676
Mojtaba Avatar asked Mar 12 '14 13:03

Mojtaba


People also ask

Can MD5 be duplicated?

Yes, there can be collisions, but the chances of that happening are so incredibly small that I wouldn't worry about it unless you were literally tracking many billions of pieces of content. Show activity on this post. If you're really afraid of accidental collisions just do both MD5 and SHA1 hashes and compare them.

Is MD5 hash unique?

If MD5 hashes any arbitrary string into a 32-digit hex value, then according to the Pigeonhole Principle surely this can not be unique, as there are more unique arbitrary strings than there are unique 32-digit hex values.

Can 2 values have the same hash?

Two files can have the same MD5 hash even if there are different. As the MD5 algorithm can take an infinity of input and give a limited number of output, it's not impossible, even if the probability of collision is very low. So, you have the short answer now, let's take a look at an example and how to avoid this issue.

Can two strings have same hash?

Yes. Two different strings can absolutely give the same SHA256. If you know the hash it takes 2^256 evaluations to find another string that gives the same hash.


1 Answers

The issue is NVARCHAR and VARCHAR get hashed to different values. Both HASHBYTES('MD5', 'Analytics'), and [WordHash] are hashes of VARCHAR values but [Word] is a NVARCHAR.

select HASHBYTES('MD5',  'Analytics'), 'varchar'
union
select HASHBYTES('MD5', N'Analytics'), 'nvarchar'

--outputs
------------------------------------- --------
0xA768CAA988605A2846599CF7E2D0C26A    varchar
0xF4AFA5FEF805F7F5163EC6402BAF61FF    nvarchar

To fix this you must either change [Word] to be VARCHAR or re-compute [WordHash] using NVARCHAR values.

Some useful further reading: Comparing SQL Server HASHBYTES function and .Net hashing

like image 192
Scott Chamberlain Avatar answered Sep 30 '22 16:09

Scott Chamberlain