Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a faster way to gather data from workbooks than iterating Workbooks.Open()?

I'm working on a project where the data set is stored across thousands of Excel workbooks (currently ~14000) in a folder on a Sharepoint server, each of which gets regularly modified to reflect changes to the data subset in that workbook. I know it's not a good way to store and update this data, but that's the situation as is.

I have to perform various queries that involve extracting the contents of one or more fields from all workbooks in the folder. I've been doing this by iterating through the set of workbooks as follows:

Function getData() As Workbook

Dim resultBk As Workbook
Dim fldr As Folder
Dim fso As New FileSystemObject
Dim fileObj As File
Dim filePath As String
Dim queryBk As Workbook

'create a workbook for storing the query results
Set resultBk = Workbooks.Add(resultBkTemplatePath)

'get the folder with all the workbooks to be queried
Set fldr = fso.GetFolder(sharepointFolderPath)

For Each fileObj In fldr.Files
    'try opening each of the workbooks
    Set queryBk = Workbooks.Open(fileObj.Path, ReadOnly:=True)

    'get data from queryBk and add it to resultBk, or add a row with an error message if queryBk failed to open
    addBkDataToResults resultBk, queryBk

    queryBk.Close False
    Set queryBk = Nothing
Next

getData = resultBk

End Function

This process of opening and closing so many workbooks across a Sharepoint connection is very slow; it generally takes 12-14 hours to run a complete query. Is there a faster way to read data out of an Excel book that doesn't need to open/close it? Or is there a way that I can handle the open/close process to make this faster?

like image 410
sigil Avatar asked Nov 13 '22 11:11

sigil


1 Answers

I would use ADO and connect to the Excel workbooks as databases, allowing you to issue SQL statements against them. I would definitely do this for reading the data, but also for the writing (see here).

Saving the data in Access would certainly be an improvement over saving it in Excel, and you could consider scheduling a compact every so often to keep the database size down.

like image 54
Zev Spitz Avatar answered Nov 15 '22 05:11

Zev Spitz