
VBScript - Storing SHA1 as Numeric or Binary Value in SQL Server

I'm currently storing my SHA1 value in SQL Server as char(40). I'm under the impression that I could possibly increase the speed of my lookups by changing this field to a numeric value. However, I'm uncertain which data type to use to store it in SQL Server and how to convert it in VBScript. Should I use numeric or decimal, and how many digits do I need?

I have read somewhere that using Binary(20) is recommended. However, working with Binary values in VBScript doesn't seem to be too easy so I'm assuming that I'll be better off using a numeric value instead.

Currently this is my SHA1 function. I store the string value it returns in my char(40) field in the database and perform my lookups using the second bit of code below.

Private Function SHA1(s)
    Dim asc, enc, bytes, outstr, pos
    Set asc = CreateObject("System.Text.UTF8Encoding")
    Set enc = CreateObject("System.Security.Cryptography.SHA1CryptoServiceProvider")
    'Convert the string to a byte array and hash it
    bytes = asc.GetBytes_4(s) 'This is how you use .Net overloaded methods in VBScript
    bytes = enc.ComputeHash_2((bytes))
    outstr = ""
    'Convert the byte array to a hex string
    For pos = 1 To Lenb(bytes)
        outstr = outstr & LCase(Right("0" & Hex(Ascb(Midb(bytes, pos, 1))), 2))
    Next
    SHA1 = outstr
    Set asc = Nothing
    Set enc = Nothing
End Function

Here's my lookup function. It already runs quite quickly, but I'm looking for any way I can optimize my code. If I do store the data as binary, I'll have to use binary when I look it up too. I suppose I could use stored procedures, which would let me use SQL Server functions to convert back and forth. Maybe that would be a better route. Please advise.

Function GetHTTPRefererIDBySHA1(s)
    Dim r
    Set r = Server.CreateObject("ADODB.Recordset")      
    r.open "SELECT httprefererid FROM httpreferer " & _
            "WHERE sha1 = '" & s & "'", con, adOpenForwardOnly, adLockReadOnly
    If Not (r.eof and r.bof) then
        GetHTTPRefererIDBySHA1 = r("httprefererid")
    End If
    r.close
    set r = nothing
End Function

Edit:
Thanks to ScottE and Google I was able to speed up my queries noticeably. Here's a little information on my solution.
1) I created a field called SHA1Bin. It's a field of type binary(20).
2) When I insert a new record I use a stored procedure. Because I'm not overly concerned about space, I save the raw httpreferer value and the SHA1 binary value of it in the same table and same row. My stored procedure converts the raw value to SHA1 binary using the HashBytes function (SQL Server 2008).
3) My SHA1 function in VBScript remains the same as above, but I now use it when I do lookups. Here's a modified version of the GetHTTPRefererIDBySHA1 function:

Function GetHTTPRefererIDBySHA1(s)
    Dim r
    Set r = Server.CreateObject("ADODB.Recordset")      
    r.open "SELECT httprefererid FROM httpreferer WHERE " & _
            "sha1bin = CONVERT(binary(20), 0x" & SHA1(s) & ")", _
            tcon, adOpenForwardOnly, adLockReadOnly

    If Not (r.eof and r.bof) then
        GetHTTPRefererIDBySHA1 = r("httprefererid")
    Else
        '//Insert new record code intentionally omitted
    End If
    r.close
    set r = nothing
End Function
HK1 asked Mar 19 '26 04:03


1 Answer

I think that you're relatively on the right track; however, there are a couple of things that you can do to make this a tad faster.

SHA1 Background

Wherever you read that SHA1 should be stored as binary(20) was pretty much dead on. A SHA1 hash is a 160-bit digest (20 bytes) that we usually work with in its raw format, as you already know, since your function converts that raw binary into a hex string.

Converting to NUMERIC

Either way, 20 bytes is 20 bytes. You can't convert it to something else to make it perform faster for the database. Trying to convert it to a numeric will fail with an arithmetic overflow error: numeric tops out at 38 digits of precision (17 bytes of storage), while a 160-bit value can need up to 49 decimal digits.
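The overflow can be checked with a little arithmetic. The sketch below is in Python purely for illustration (the question itself uses VBScript); the constant MAX_NUMERIC_PRECISION is a name introduced here, holding SQL Server's documented maximum precision of 38 for numeric/decimal.

```python
# Largest value a 160-bit SHA-1 digest can represent.
max_sha1 = 2**160 - 1

# Number of decimal digits needed to write that value out.
digits_needed = len(str(max_sha1))

# Assumption for illustration: SQL Server numeric/decimal allows at most 38 digits.
MAX_NUMERIC_PRECISION = 38

fits_in_numeric = digits_needed <= MAX_NUMERIC_PRECISION

print(digits_needed)    # 49
print(fits_in_numeric)  # False
```

Since 49 digits exceed the 38 available, no numeric or decimal column can hold an arbitrary SHA1 value without loss.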

How to Make it Better

You have half the battle done. You can keep the data as a character data type if it is easier to work with in VBScript. Alternatively, you could store it as a BINARY(20); this is the approach I take for my data warehouse projects. If you are going to keep it as a string, CHAR(40) is the right size: the hex representation uses two characters per byte, so all 40 characters are occupied and anything shorter would truncate the value. The one "gotcha" is the "0x" prefix, which is technically not part of the value but is necessary to indicate that the value is binary when constructing your SQL statement. Your function does not emit it, so just do the concatenation where necessary rather than storing it. Switching to BINARY(20) halves the width of the column, so SQL Server performs fewer reads to get at your data, which will speed things up. A VARCHAR would not help here: the hash is always the same length, so there is no trailing whitespace to save.
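To make the sizes concrete, here is a small sketch of the three forms a SHA1 value takes: raw bytes, hex string, and a "0x"-prefixed literal for a SQL statement. It is written in Python for illustration only; the helper name sha1_forms is made up for this example and is not part of the original code.

```python
import hashlib


def sha1_forms(text):
    """Return the raw digest, its hex string, and a 0x-prefixed SQL literal."""
    raw = hashlib.sha1(text.encode("utf-8")).digest()  # 20 raw bytes
    hex_str = raw.hex()                                # 40 lowercase hex characters
    literal = "0x" + hex_str                           # prefix only needed in SQL text
    return raw, hex_str, literal


raw, hex_str, literal = sha1_forms("abc")

print(len(raw))      # 20
print(len(hex_str))  # 40
print(literal)       # 0xa9993e364706816aba3e25717850c26c9cd0d89d
```

The raw form is what a BINARY(20) column stores; the hex form is what a CHAR(40) column stores, at twice the width.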

Aside from that, make sure the column is indexed. If you've not done so already, create an index on your SHA1 column and include httprefererid in it. Your query can then be answered from the index alone, which is about as fast as it can get, because only the necessary data elements are read. This is called a covering index (because it covers your filter plus the selected columns). That index would look something like:

create index ix_httpreferer_sha1 on dbo.httpreferer (sha1) include (httprefererid);
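The same idea can be tried end to end without a SQL Server instance. The sketch below uses Python with SQLite as a stand-in, which is a substitution rather than the setup discussed here: SQLite has no INCLUDE clause, its BLOB type plays the role of binary(20), and the sample URLs and the helper names sha1_bytes and get_referer_id are invented for illustration.

```python
import hashlib
import sqlite3


def sha1_bytes(text):
    """Return the raw 20-byte SHA-1 digest of a UTF-8 string."""
    return hashlib.sha1(text.encode("utf-8")).digest()


con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE httpreferer ("
    " httprefererid INTEGER PRIMARY KEY,"
    " referer TEXT NOT NULL,"
    " sha1bin BLOB NOT NULL)"
)
# SQLite indexes already carry the integer primary key (rowid),
# so this index alone can answer the lookup below.
con.execute("CREATE INDEX ix_httpreferer_sha1 ON httpreferer (sha1bin)")

# Made-up sample data: store the raw value and its binary hash in the same row.
for url in ["http://example.com/a", "http://example.com/b"]:
    con.execute(
        "INSERT INTO httpreferer (referer, sha1bin) VALUES (?, ?)",
        (url, sha1_bytes(url)),
    )


def get_referer_id(con, url):
    """Look up a referer id by the binary hash, passed as a bound parameter."""
    row = con.execute(
        "SELECT httprefererid FROM httpreferer WHERE sha1bin = ?",
        (sha1_bytes(url),),
    ).fetchone()
    return row[0] if row else None


found_id = get_referer_id(con, "http://example.com/b")
missing_id = get_referer_id(con, "http://example.com/zzz")
print(found_id, missing_id)
```

Binding the hash as a parameter also avoids building the SQL statement by string concatenation.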

Hope that helps!

scottE answered Mar 22 '26 00:03


