Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

VBA: Performance of multidimensional List, Array, Collection or Dictionary

I'm currently writing code to combine two worksheets containing different versions of data. Hereby I first want to sort both via a Key Column, combine 'em and subsequently mark changes between the versions in the output worksheet.

As the data amounts to already several 10000 lines and might some day exceed the lines-per-worksheet limit of excel, I want these calculations to run outside of a worksheet. Also it should perform better.

Currently I'm thinking of a Quicksort of first and second data and then comparing the data sets per key/line. Using the result of the comparison to subsequently format the cells accordingly.

Question

I'd just love to know, whether I should use:

List OR Array OR Collection OR Dictionary
OF Lists OR Arrays OR Collections OR Dictionaries

I have as of now been unable to determine the differences in codability and performance between this 16 possibilities. Currently I'm implementing an Array OF Arrays approach, constantly wondering whether this makes sense at all?

Thanks in advance, appreciate your input and wisdom!

like image 954
POe Avatar asked Mar 27 '13 10:03

POe


1 Answers

Some time ago, I had the same problem with the macro of a client. Additionally to the really big number of rows (over 50000 and growing), it had the problem of being tremendously slow from certain row number (around 5000) when a "standard approach" was taken, that is, the inputs for the calculations on each row were read from the same worksheet (a couple of rows above); this process of reading and writing was what made the process slower and slower (apparently, Excel starts from row 1 and the lower is the row, the longer it takes to reach there). I improved this situation by relying on two different solutions: firstly, setting a maximum number of rows per worksheet, once reached, a new worksheet was created and the reading/writing continued there (from the first rows). The other change was moving the reading/writing in Excel to reading from temporary .txt files and writing to Excel (all the lines were read right at the start to populate the files). These two modifications improved the speed a lot (from half an hour to a couple of minutes). Regarding your question, I wouldn't rely too much on arrays with a macro (although I am not sure about how much information contains each of these 10000 lines); but I guess that this is a personal decision. I don't like collections too much because of being less efficient than arrays; and same thing for dictionaries.

I hope that this "short" comment will be of any help.

like image 73
varocarbas Avatar answered Oct 13 '22 00:10

varocarbas