Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I calculate the equivalent to SQL Server (hashbytes('SHA1',[ColumnName])) in C#?

In my database I have a computed column that contains a SHA1 hash of a column called URLString which holds URLs (e.g. "http://xxxx.com/index.html").

I often need to query the table to find a specific URL based on the URLString column. The table contains 100K's and these queries take several seconds (using SQL Azure). Since URLs can be quite long, I cannot create an index on this column (above 450 bytes).

To speed things up I want to calculate the equivalent of SQL Server hashbytes('SHA1',[URLString]) from C# and query based on this value.

I tried the below code, but the value I get is different than the one calculated by the database.

var urlString = Encoding.ASCII.GetBytes(url.URLString); //UTF8 also fails
var sha1 = new SHA1CryptoServiceProvider();
byte[] hash = sha1.ComputeHash(urlString);

Am I missing something trivial here?
I'm open to other ideas that can solve the same problem (as long as they are supported by SQL Azure).

Example: in the database the automatically calculated SHA1 value of URL http://www.whatismyip.org/ is 0xAE66CA69A157186A511ED462153D7CA65F0C1BF7.

like image 791
Gilad Gat Avatar asked Jan 03 '13 20:01

Gilad Gat


3 Answers

You're likely getting bitten by character encoding differences:

http://weblogs.sqlteam.com/mladenp/archive/2009/04/28/Comparing-SQL-Server-HASHBYTES-function-and-.Net-hashing.aspx

You could try getting the bytes via Encoding.ASCII.GetBytes(url) or Encoding.Unicode.GetBytes(url) and see which one your db is using.

like image 111
JerKimball Avatar answered Oct 24 '22 23:10

JerKimball


Below are two methods that do hashing of string and of bytes. The HashBytes method returns Base64 of the resulting bytes but you can return just the bytes if you prefer them

public static string HashString(string cleartext)
{
    byte[] clearBytes = Encoding.UTF8.GetBytes(cleartext);
    return HashBytes(clearBytes);
}  

public static string HashBytes(byte[] clearBytes)
{
    SHA1 hasher = SHA1.Create();
    byte[] hashBytes =   hasher.ComputeHash(clearBytes);
    string hash = System.Convert.ToBase64String(hashBytes);
    hasher.Clear();
    return hash;
}
like image 36
Sten Petrov Avatar answered Oct 24 '22 23:10

Sten Petrov


The below code is equivalent to SQL Server's hashbytes('sha1')

using (SHA1Managed sha1 = new SHA1Managed()) {
    var hash = sha1.ComputeHash(Encoding.Unicode.GetBytes(input));
    var sb = new StringBuilder(hash.Length * 2);

    foreach (byte b in hash) {
        // can be "x2" if you want lowercase
        sb.Append(b.ToString("X2"));
    }

    string output = sb.ToString();
}
like image 38
Mike Minh Le Avatar answered Oct 24 '22 21:10

Mike Minh Le