
What is a superfast way to read large files line-by-line in VBA?

Tags: file-io, vba

I believe I have come up with a very efficient way to read very, very large files line-by-line. Please tell me if you know of a better/faster way or see room for improvement. I am trying to get better at coding, so any sort of advice you have would be nice. Hopefully this is something that other people might find useful, too.

In my tests, it appears to be roughly 8 times faster than using Line Input.

'This function reads a whole file into a string.
'I found this in the book Programming Excel with VBA and .NET.
Public Function QuickRead(FName As String) As String
    Dim I As Integer
    Dim res As String
    Dim l As Long

    I = FreeFile
    l = FileLen(FName)
    'Pre-size the buffer so that Get fills it in a single read.
    res = Space(l)
    Open FName For Binary Access Read As #I
    Get #I, , res
    Close #I
    QuickRead = res
End Function

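'This procedure works like the Line Input statement.
Public Sub QRLineInput( _
    ByRef strFileData As String, _
    ByRef lngFilePosition As Long, _
    ByRef strOutputString As String, _
    ByRef blnEOF As Boolean _
    )
    'When no further line break exists, InStr returns 0, the length passed to
    'Mid$ becomes negative, and the resulting error signals the end of the data.
    On Error GoTo LastLine
    strOutputString = Mid$(strFileData, lngFilePosition, _
        InStr(lngFilePosition, strFileData, vbNewLine) - lngFilePosition)
    'Skip past the line break; Len(vbNewLine) avoids hard-coding its length as 2.
    lngFilePosition = InStr(lngFilePosition, strFileData, vbNewLine) + Len(vbNewLine)
    Exit Sub
LastLine:
    blnEOF = True
End Sub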

Sub Test()
    Dim strFilePathName As String: strFilePathName = "C:\Fld\File.txt"
    Dim strFile As String
    Dim lngPos As Long
    Dim blnEOF As Boolean
    Dim strFileLine As String

    strFile = QuickRead(strFilePathName) & vbNewLine
    lngPos = 1

    Do Until blnEOF
        Call QRLineInput(strFile, lngPos, strFileLine, blnEOF)
    Loop
End Sub

Thanks for the advice!

Justin asked Sep 04 '09 01:09


1 Answer

My two cents…

Not long ago I needed to read large files using VBA and noticed this question. I tested three approaches to reading data from a file, comparing their speed and reliability over a wide range of file sizes and line lengths. The approaches are:

  1. Line Input VBA statement
  2. Using the File System Object (FSO)
  3. Using Get VBA statement for the whole file and then parsing the string read as described in posts here
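For reference, the first two approaches might look roughly like the following. This is a minimal sketch and not the code used in the tests; the function names `CountLinesLineInput` and `CountLinesFSO` are hypothetical, and each one only counts lines so that the loop has something to do. Method 3 is the `QuickRead`/`QRLineInput` code from the question.

```vba
Option Explicit

'Method 1: the built-in Line Input statement.
Public Function CountLinesLineInput(ByVal strPath As String) As Long
    Dim intFile As Integer
    Dim strLine As String
    Dim lngCount As Long

    intFile = FreeFile
    Open strPath For Input As #intFile
    Do Until EOF(intFile)
        Line Input #intFile, strLine
        lngCount = lngCount + 1
    Loop
    Close #intFile
    CountLinesLineInput = lngCount
End Function

'Method 2: the File System Object (late bound, so no project reference is needed).
Public Function CountLinesFSO(ByVal strPath As String) As Long
    Const ForReading As Long = 1
    Dim objFSO As Object
    Dim objStream As Object
    Dim strLine As String
    Dim lngCount As Long

    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objStream = objFSO.OpenTextFile(strPath, ForReading)
    Do Until objStream.AtEndOfStream
        strLine = objStream.ReadLine
        lngCount = lngCount + 1
    Loop
    objStream.Close
    CountLinesFSO = lngCount
End Function
```

Both functions take the same path argument, so either can be dropped into the same test harness.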

Each test case consists of three steps:

  1. Test case setup. Write a text file containing a given number of lines, all of the same given length, filled with a known character pattern.
  2. Integrity test. Read each line of the file and verify its length and contents.
  3. File read speed test. Read each line of the file, repeated 10 times.
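Steps 1 and 2 could be sketched as follows. The original test code is not shown, so everything here is an assumption: the names `PatternLine`, `WriteTestFile` and `VerifyTestFile` are hypothetical, the repeating "0123456789" pattern stands in for the unspecified known pattern, and the integrity check uses Line Input only as an example reader.

```vba
Option Explicit

'Hypothetical helper: builds one line of the given length from a repeating
'"0123456789" pattern (an assumed pattern, not the one used in the tests).
Private Function PatternLine(ByVal lngLength As Long) As String
    Dim strLine As String
    Dim lngPos As Long

    strLine = Space$(lngLength)
    For lngPos = 1 To lngLength
        Mid$(strLine, lngPos, 1) = Chr$(48 + ((lngPos - 1) Mod 10))
    Next lngPos
    PatternLine = strLine
End Function

'Step 1: write lngLines lines of lngLength characters each.
Public Sub WriteTestFile(ByVal strPath As String, _
                         ByVal lngLines As Long, _
                         ByVal lngLength As Long)
    Dim intFile As Integer
    Dim strLine As String
    Dim lngRow As Long

    strLine = PatternLine(lngLength)
    intFile = FreeFile
    Open strPath For Output As #intFile
    For lngRow = 1 To lngLines
        Print #intFile, strLine   'Print # appends a line break after the text
    Next lngRow
    Close #intFile
End Sub

'Step 2: read every line back and verify the line count and contents.
Public Function VerifyTestFile(ByVal strPath As String, _
                               ByVal lngLines As Long, _
                               ByVal lngLength As Long) As Boolean
    Dim intFile As Integer
    Dim strLine As String
    Dim strExpected As String
    Dim lngRow As Long
    Dim blnOK As Boolean

    strExpected = PatternLine(lngLength)
    blnOK = True
    intFile = FreeFile
    Open strPath For Input As #intFile
    Do Until EOF(intFile)
        Line Input #intFile, strLine
        lngRow = lngRow + 1
        'A binary string comparison covers both length and contents.
        If strLine <> strExpected Then blnOK = False
    Loop
    Close #intFile
    VerifyTestFile = blnOK And (lngRow = lngLines)
End Function
```

Comparing each line against the expected string is the kind of per-line work that Step 2 adds on top of the raw read.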

Note that Step #3 measures the raw file read speed (as asked in the question), while Step #2 verifies the integrity of the data read and therefore simulates real conditions in which string parsing is needed.
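A speed test in the spirit of Step #3 might be timed like this. It is only a sketch: the function name `TimeLineInput` is hypothetical, "repeated 10 times" is assumed to mean ten full passes over the file, and Line Input stands in for whichever read method is being measured.

```vba
Option Explicit

'Step 3 sketch: time 10 full passes over the file using Line Input.
'Returns the elapsed time in seconds.
Public Function TimeLineInput(ByVal strPath As String) As Double
    Const REPEAT_COUNT As Long = 10
    Dim intFile As Integer
    Dim strLine As String
    Dim lngPass As Long
    Dim sngStart As Single
    Dim dblElapsed As Double

    sngStart = Timer
    For lngPass = 1 To REPEAT_COUNT
        intFile = FreeFile
        Open strPath For Input As #intFile
        Do Until EOF(intFile)
            Line Input #intFile, strLine   'read only; no checks on the line
        Loop
        Close #intFile
    Next lngPass
    dblElapsed = Timer - sngStart
    'Timer counts seconds since midnight, so correct for a midnight rollover.
    If dblElapsed < 0 Then dblElapsed = dblElapsed + 86400#
    TimeLineInput = dblElapsed
End Function
```

Calling `Debug.Print TimeLineInput("C:\Fld\File.txt")` in the Immediate window would print the total time for the ten passes; the path is the example path from the question.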

The following chart shows the results of the file read speed test. The file size is 64 MB for all tests; the tests differ only in line length, which varies from 2 bytes (not including CRLF) to 8 MB.

[Chart: file read speed of the three methods versus line length. The image is no longer displayed.]

CONCLUSION:

  1. All three methods are reliable for large files with normal and abnormal line lengths (please compare to Graeme Howard’s answer).
  2. All three methods give almost the same file reading speed for normal line lengths.
  3. The “superfast way” (Method #3) works fine for extremely long lines, while the other two don’t.
  4. All of this applies across different Office versions and different PCs, for both VBA and VB6.
Argut answered Oct 05 '22 13:10