In terms of memory consumption/ execution time, what's a more expensive way of adding items to a group
Redim Preserve myArray(1 To Ubound(myArray) + 1)
myArray(Ubound(myArray)) = myVal
Or
myCollection.Add myVal
Does the fastest one depend on myVal
, or vary with the size of the groups? Is there a faster method still?
I have the array/collection declared Privately in the declarations portion of a class if that makes a difference, but I'd like to know what is happening behind the scenes and which approach is generally faster (not in terms of readability or maintainability, just execution time)
OK, having run some tests adding many instances of 1 to groups and collections, my results are:
Results were all for approx 5 seconds of looping with this code:
Sub testtime()
Dim sttime As Double
Dim endtime As Double
Dim i As Long
Dim u As Long
i = 0
ReDim a(1 To 1) 'or Set c = New Collection
sttime = Timer
endtime = Timer + 5
Do Until Timer > endtime
u = UBound(a) + 1
ReDim Preserve a(1 To u)
a(u) = 1
'or c.Add 1
i = i + 1
Loop
endtime = Timer
Debug.Print (endtime - sttime) / i; "s", i; "iterations", Round(endtime - sttime, 3); "(ish) s"
End Sub
So it looks like for adding that item, with relatively large groups; adding to a collection is faster, but I'd like to know why?
ReDim Preserve myArray(1 To UBound(myArray) + 1)
That's inherently inefficient, and an unfair comparison. You're copying the entire array internally, every time you add an item. I would hope a Collection
is much more efficient than that. If not, use .NET's ArrayList
, which is deprecated in .NET since v2.0 introduced generics and List<T>
, but usable - and useful - in VBA (.NET generics can't be used in VBA).
An ArrayList
doesn't resize its internal _items
array every single time an item is added! Notice the comments:
// Adds the given object to the end of this list. The size of the list is // increased by one. If required, the capacity of the list is doubled // before adding the new element. // public virtual int Add(Object value) { Contract.Ensures(Contract.Result<int>() >= 0); if (_size == _items.Length) EnsureCapacity(_size + 1); _items[_size] = value; _version++; return _size++; } ... // Ensures that the capacity of this list is at least the given minimum // value. If the currect capacity of the list is less than min, the // capacity is increased to twice the current capacity or to min, // whichever is larger. private void EnsureCapacity(int min) { if (_items.Length < min) { int newCapacity = _items.Length == 0? _defaultCapacity: _items.Length * 2; // Allow the list to grow to maximum possible capacity (~2G elements) before encountering overflow. // Note that this check works even when _items.Length overflowed thanks to the (uint) cast if ((uint)newCapacity > Array.MaxArrayLength) newCapacity = Array.MaxArrayLength; if (newCapacity < min) newCapacity = min; Capacity = newCapacity; } }
https://referencesource.microsoft.com/#mscorlib/system/collections/arraylist.cs
I don't know about the internals of a VBA.Collection
, but if I had to guess, I would say it likely has a similar mechanism that avoids re-dimensioning the internal array every single time an item is added. But that's all moot, since nobody except Microsoft knows how VBA.Collection
is implemented.
What we can do though, is run benchmarks and compare - let's add one million values to an array, a collection, and heck, an ArrayList
:
Public Sub TestReDimArray()
Dim sut() As Variant
ReDim sut(0 To 0)
Dim i As Long
Dim t As Single
t = Timer
Do While UBound(sut) < 1000000
ReDim Preserve sut(0 To i)
sut(i) = i
i = i + 1
Loop
Debug.Print "ReDimArray added 1M items in " & Timer - t & " seconds."
End Sub
Public Sub TestCollection()
Dim sut As VBA.Collection
Set sut = New VBA.Collection
Dim i As Long
Dim t As Single
t = Timer
Do While sut.Count < 1000000
sut.Add i
i = i + 1
Loop
Debug.Print "Collection added 1M items in " & Timer - t & " seconds."
End Sub
Public Sub TestArrayList()
Dim sut As Object
Set sut = CreateObject("System.Collections.ArrayList")
Dim i As Long
Dim t As Single
t = Timer
Do While sut.Count < 1000000
sut.Add i
i = i + 1
Loop
Debug.Print "ArrayList added 1M items in " & Timer - t & " seconds."
End Sub
Here's the output:
ReDimArray added 1M items in 14.90234 seconds.
Collection added 1M items in 0.1875 seconds.
ArrayList added 1M items in 15.64453 seconds.
Note that referencing the 32-bit mscorlib.tlb
and early-binding the ArrayList
doesn't make much of a difference. Plus there's managed/COM interop overhead, and VBA doesn't support constructors, so the ArrayList
initializes with a capacity of 4
that doubles every time capacity is reached, i.e. when we insert the millionth item we've resized the internal array 19 times and end up with an internal capacity of 1,048,576 items.
So how is Collection
winning by that much anyway?
Because the array is being abused: resizing isn't what arrays do best, and resizing before every insert can't possibly go well.
Use arrays when you know the number of elements in advance:
Public Sub TestPopulateArray()
Dim sut(0 To 999999) As Variant
Dim i As Long
Dim t As Single
t = Timer
Do While i < 1000000
sut(i) = i
i = i + 1
Loop
Debug.Print "PopulateArray added 1M items in " & Timer - t & " seconds."
End Sub
Output:
PopulateArray added 1M items in 0.0234375 seconds.
That's roughly 10 times faster than adding the same number of items to a VBA.Collection
- well-used arrays are blazingly fast.
Keep array resizing to a minimum - avoid it as much as possible. If you don't know the number of items you're going to end up with, use a Collection
. If you do, use an explicitly sized Array
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With