
File comparison in VB.Net

I need to know whether two files are identical. At first I compared file sizes and creation timestamps, but that is not reliable enough. I have come up with the following code, which seems to work, but I'm hoping that someone has a better, easier or faster way of doing it.

Basically, I am streaming the file contents into byte arrays and comparing their MD5 hashes via System.Security.Cryptography.

Before that I do some simple checks, since there is no reason to read through the files if both file paths are identical or if one of the files does not exist.

Public Function CompareFiles(ByVal file1FullPath As String, ByVal file2FullPath As String) As Boolean

    If Not File.Exists(file1FullPath) Or Not File.Exists(file2FullPath) Then
        ' One or both of the files do not exist.
        Return False
    End If

    If String.Compare(file1FullPath, file2FullPath, True) = 0 Then
        ' file1FullPath and file2FullPath point to the same file.
        Return True
    End If

    Dim MD5Crypto As New MD5CryptoServiceProvider()
    Dim textEncoding As New System.Text.ASCIIEncoding()

    Dim fileBytes1() As Byte, fileBytes2() As Byte
    Dim fileContents1, fileContents2 As String
    Dim streamReader As StreamReader = Nothing
    Dim fileStream As FileStream = Nothing
    Dim isIdentical As Boolean = False

    Try

        ' Read file 1 to byte array.
        fileStream = New FileStream(file1FullPath, FileMode.Open)
        streamReader = New StreamReader(fileStream)
        fileBytes1 = textEncoding.GetBytes(streamReader.ReadToEnd)
        fileContents1 = textEncoding.GetString(MD5Crypto.ComputeHash(fileBytes1))
        streamReader.Close()
        fileStream.Close()

        ' Read file 2 to byte array.
        fileStream = New FileStream(file2FullPath, FileMode.Open)
        streamReader = New StreamReader(fileStream)
        fileBytes2 = textEncoding.GetBytes(streamReader.ReadToEnd)
        fileContents2 = textEncoding.GetString(MD5Crypto.ComputeHash(fileBytes2))
        streamReader.Close()
        fileStream.Close()

        ' Compare the two hash strings and store the result.
        isIdentical = fileContents1 = fileContents2

    Catch ex As Exception

        isIdentical = False

    Finally

        If Not streamReader Is Nothing Then streamReader.Close()
        If Not fileStream Is Nothing Then fileStream.Close()
        fileBytes1 = Nothing
        fileBytes2 = Nothing

    End Try

    Return isIdentical
End Function
Gertsen asked Dec 24 '22 22:12



1 Answer

I would say hashing the files is the way to go. It's how I have done it in the past.

Use Using statements when working with streams and other disposable objects. The Using block disposes of them automatically, even if an exception is thrown. Here is an example.

Public Function CompareFiles(ByVal file1FullPath As String, ByVal file2FullPath As String) As Boolean

    If Not File.Exists(file1FullPath) OrElse Not File.Exists(file2FullPath) Then
        ' One or both of the files do not exist.
        Return False
    End If

    If file1FullPath = file2FullPath Then
        ' Both paths point to the same file.
        Return True
    End If

    Try
        Dim file1Hash As String = hashFile(file1FullPath)
        Dim file2Hash As String = hashFile(file2FullPath)

        ' The files are treated as identical when their hashes match.
        Return file1Hash = file2Hash

    Catch ex As Exception
        ' Treat any I/O or access error as "not identical".
        Return False
    End Try
End Function

Private Function hashFile(ByVal filepath As String) As String
    Using reader As New System.IO.FileStream(filepath, IO.FileMode.Open, IO.FileAccess.Read)
        Using md5 As New System.Security.Cryptography.MD5CryptoServiceProvider
            Dim hash() As Byte = md5.ComputeHash(reader)
            ' Base64 is a lossless text form of the hash. Decoding the raw bytes
            ' as UTF-16 text can replace invalid sequences and hide differences.
            Return System.Convert.ToBase64String(hash)
        End Using
    End Using
End Function
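
As a possible refinement of the example above, here is a minimal sketch that checks the file lengths before hashing and compares the raw hash bytes instead of converting them to a string. The names FilesMatch and HashBytes are hypothetical and only used for illustration, and the length check assumes that reading the file size is cheap compared to hashing the whole file.

```vbnet
Imports System.IO
Imports System.Linq
Imports System.Security.Cryptography

Public Module FileComparison

    ' Hypothetical helper (not part of the original answer): True when both
    ' files exist, have the same length, and produce the same MD5 hash.
    Public Function FilesMatch(ByVal path1 As String, ByVal path2 As String) As Boolean
        If Not File.Exists(path1) OrElse Not File.Exists(path2) Then
            Return False
        End If

        ' Files of different length cannot be identical, so skip the hashing.
        Dim length1 As Long = New FileInfo(path1).Length
        Dim length2 As Long = New FileInfo(path2).Length
        If length1 <> length2 Then
            Return False
        End If

        ' Compare the raw hash bytes; no text encoding is involved.
        Return HashBytes(path1).SequenceEqual(HashBytes(path2))
    End Function

    ' Hypothetical helper: streams the file through MD5 and returns the hash bytes.
    Private Function HashBytes(ByVal filePath As String) As Byte()
        Using stream As New FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.Read)
            Using hasher As MD5 = MD5.Create()
                Return hasher.ComputeHash(stream)
            End Using
        End Using
    End Function

End Module
```

Unlike CompareFiles, this sketch does not catch exceptions, so a caller would have to decide how to handle files that are locked or unreadable.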
Nathan answered Jan 08 '23 13:01
