Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to generate md5-hashes for large files with VBA?

Tags:

excel

hash

vba

I have the following functions to generate md5-hashes for files. The functions work great for small files, but crashes and generate Run-time error 7 - Out of memory when I try to hash files over ~250 MB (I don't actually know at which exact size it breaks, but files below 200 MB work fine).

I don't understand why it breaks at a certain size, so if anyone could shed some light on that I would appreciate it a lot.

Also, is there anything I can do to make the functions handle larger files? I intend to use the functions in a larger tool where I will need to generate hashes for files of unknown sizes. Most will be small enough for the current functions to work, but I will have to be able to handle large files as well.

I got my current functions from the most upvoted answer this post How to get the MD5 hex hash for a file using VBA?

Public Function FileToMD5Hex(ByVal strFileName As String) As String
Dim varEnc           As Variant
Dim varBytes         As Variant
Dim strOut           As String
Dim intPos           As Integer

Set varEnc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")

'Convert the string to a byte array and hash it
varBytes = GetFileBytes(strFileName)
varBytes = varEnc.ComputeHash_2((varBytes))

'Convert the byte array to a hex string
For intPos = 1 To LenB(varBytes)
   strOut = strOut & LCase(Right("0" & Hex(AscB(MidB(varBytes, intPos, 1))), 2))
Next

FileToMD5Hex = strOut

Set varEnc = Nothing

End Function

Private Function GetFileBytes(ByVal strPath As String) As Byte()
Dim lngFileNum          As Long
Dim bytRtnVal()         As Byte

lngFileNum = FreeFile

'If file exists, get number of bytes
If LenB(Dir(strPath)) Then
   Open strPath For Binary Access Read As lngFileNum
   ReDim bytRtnVal(LOF(lngFileNum)) As Byte
   Get lngFileNum, , bytRtnVal
   Close lngFileNum
Else
   MsgBox "Filen finns inte" & vbCrLf & "Avbryter", vbCritical, "Filen hittades inte"
   Exit Function
End If

GetFileBytes = bytRtnVal
Erase bytRtnVal

End Function

Thank you

like image 302
mejchei Avatar asked Mar 31 '16 08:03

mejchei


People also ask

Can 2 files have the same MD5?

Generally, two files can have the same md5 hash only if their contents are exactly the same. Even a single bit of variation will generate a completely different hash value. There is one caveat, though: An md5 sum is 128 bits (16 bytes).

How long does it take to generate MD5 checksum?

It generally takes 3-4 hours to transfer via NC and then 40 minutes to get the md5sum. The security of the hash is not an issue in this case.


1 Answers

It looks like you reached the memory limit. A better way would be to compute the MD5 of the file by block:

Public Function ComputeMD5(filepath As String) As String
  Dim buffer() As Byte, svc As Object, hFile%, blockSize&, i&
  blockSize = 2 ^ 16

  ' open the file '

  If Len(Dir(filepath)) Then Else Err.Raise 5, , "file not found" & vbCr & filepath

  hFile = FreeFile
  Open filepath For Binary Access Read As hFile

  ' allocate buffer '

  If LOF(hFile) < blockSize Then blockSize = ((LOF(hFile) + 1024) \ 1024) * 1024
  ReDim buffer(0 To blockSize - 1)

  ' compute hash '

  Set svc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")

  For i = 1 To LOF(hFile) \ blockSize
    Get hFile, , buffer
    svc.TransformBlock buffer, 0, blockSize, buffer, 0
  Next

  Get hFile, , buffer
  svc.TransformFinalBlock buffer, 0, LOF(hFile) Mod blockSize
  buffer = svc.Hash

  ' cleanup '

  svc.Clear
  Close hFile

  ' convert to an hexa string '

  ComputeMD5 = String$(32, "0")

  For i = 0 To 15
     Mid$(ComputeMD5, i + i + 2 + (buffer(i) > 15)) = Hex(buffer(i))
  Next

End Function
like image 103
Florent B. Avatar answered Sep 21 '22 08:09

Florent B.